From: Ammar Faizi <[email protected]>
To: GNU/Weeb Mailing List <[email protected]>
Cc: GNU/Weeb FB Team <[email protected]>,
	Michael William Jonathan <[email protected]>
Subject: Introducing Facebook Scraper API (with Tor network support)
Date: Wed, 3 May 2023 19:37:07 +0700
Message-ID: <[email protected]>


We are open-sourcing a new project: Facebook Scraper API with Tor
network support. It's written entirely in PHP (yeah, PHP; feel free to
argue that PHP is dead: it's not!). Currently, it can only scrape text
and photo posts. I'll be adding video support in the near future, and
more features will follow.
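
For readers wondering what "Tor network support" can look like in PHP,
here is a minimal sketch. It is not code from the repository. It
assumes the cURL extension is loaded and that a Tor daemon is listening
on its usual SOCKS port, 127.0.0.1:9050. The helper names
tor_curl_options() and tor_http_get() are hypothetical. The one detail
taken from the commit list below is the proxy type,
CURLPROXY_SOCKS5_HOSTNAME, which makes the proxy resolve the hostname
so that .onion names are never looked up locally.

```php
<?php
// Hypothetical helper (not from the repository): build the cURL options
// needed to send a request through a SOCKS5 proxy such as a local Tor daemon.
function tor_curl_options(string $url, string $proxy = "127.0.0.1:9050"): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_PROXY          => $proxy, // assumed default Tor SOCKS address
        // Let the proxy resolve the hostname; required for .onion hosts.
        CURLOPT_PROXYTYPE      => CURLPROXY_SOCKS5_HOSTNAME,
        CURLOPT_CONNECTTIMEOUT => 30,     // Tor circuits can be slow to build
    ];
}

// Hypothetical helper: perform a GET through the proxy and return the body.
function tor_http_get(string $url, string $proxy = "127.0.0.1:9050"): string
{
    $ch = curl_init();
    curl_setopt_array($ch, tor_curl_options($url, $proxy));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException("cURL error: " . curl_error($ch));
    }
    return $body;
}
```

Calling tor_http_get() with a URL returns the response body as a
string, or throws a RuntimeException if the proxy is unreachable.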

This project is licensed under the GPLv2, which is my default
open-source license choice.

Comments and patches are welcome.

Patches and any inquiries about this project should be directed to:

To: Ammar Faizi <[email protected]>
Cc: Michael William Jonathan <[email protected]>
Cc: GNU/Weeb Mailing List <[email protected]>
Cc: GNU/Weeb FB Team <[email protected]>

The following changes since commit 75065abdba76e40f102041b70c2edaf4bf902259:

  fb: Initial commit (2023-05-01 00:48:36 +0700)

are available in the Git repository at:

  https://gitlab.torproject.org/ammarfaizi2/Facebook.git master

for you to fetch changes up to 0d5e59e00359e165778a81f80122bb522f8edb0f:

  Merge branch 'rewrite_url' (Facebook Onion rewrite support) (2023-05-03 18:46:47 +0700)

----------------------------------------------------------------
Ammar Faizi (33):
      fb: Create the initial 'Post' trait (getTimelineYears)
      fb: Create user cache mechanism
      fb: Post: Handle a 'get timeline years' edge case
      fb: Post: Create getTimelinePosts method
      fb: Post: Make getTimelineYears() more reliable
      fb: Post: Implement getPost() function
      fb: web: Create initial web API
      fb: Post: Handle not found in getTimelineYears()
      fb: Post: Fix stupid indentation
      fb: web: Add getPost() endpoint
      fb: helpers: Replace '</p>' with double lines instead of single line
      fb: web: Create 'logs' directory for web server logs
      fb: composer.json: Remove phpunit from require-dev
      fb: Use CURLPROXY_SOCKS5_HOSTNAME as proxy type
      fb: helpers: Trim the end result of full_html_clean()
      fb: Post: Split parsing logic in getPost()
      fb: Post: Split info parser
      fb: Post: Implement tryParsePhotoPost()
      fb: Post: Introduce `$take_content` argument in `getTimelinePosts()`
      fb: Post: Introduce `$limit` argument in `getTimelinePosts()`
      fb: web: Invert the getTimelinePosts() condition
      fb: web: Integrate `$take_content` and `$limit` args
      fb: Post: Switch `content` and `info` key position
      fb: Post: Parse the embedded link in a post
      fb: web: Create `httpGet()` API for visiting FB onion endpoints
      fb: Post: Introduce rewrite URL callback
      fb: web: Provide a proxy to access onion endpoints
      fb: Post: Call cleanURL() on the img_preview URL
      fb: web: Fix `is_compressed` value
      fb: web: Supress gzinflate error
      fb: web: Don't rewrite non Facebook onion URL
      Merge branch 'post' (initial FB post scraper API)
      Merge branch 'rewrite_url' (Facebook Onion rewrite support)

 auth.example.php              |   2 +
 composer.json                 |   3 -
 main.php                      |  45 +++++
 src/Facebook/Facebook.php     |  93 ++++++++-
 src/Facebook/Methods/Post.php | 422 ++++++++++++++++++++++++++++++++++++++++
 src/Facebook/helpers.php      |  54 +++++
 web/.gitignore                |   1 +
 web/logs/.gitignore           |   2 +
 web/public/api.php            | 268 +++++++++++++++++++++++++
 9 files changed, 885 insertions(+), 5 deletions(-)
 create mode 100644 src/Facebook/Methods/Post.php
 create mode 100644 web/.gitignore
 create mode 100644 web/logs/.gitignore
 create mode 100644 web/public/api.php

-- 
Ammar Faizi

