GNU/Weeb Mailing List <[email protected]>
* [PATCH fb v1 0/3] Facebook Onion assets optimization
@ 2023-05-12 18:44 Ammar Faizi
  2023-05-12 18:44 ` [PATCH fb v1 1/3] fb: web: Don't use proxy if the host isn't an onion domain Ammar Faizi
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Ammar Faizi @ 2023-05-12 18:44 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Michael William Jonathan, GNU/Weeb FB Team, GNU/Weeb Mailing List

Hi,

Facebook Onion CDN is slow...

Replace the Facebook onion asset endpoint with its non-onion equivalent
for a faster response time, and avoid HTTP requests to Facebook onion
hosts where possible. When using Facebook onion, the CDN asset URL looks
like this:

  https://scontent.xx.facebookcooa4ldbat4g7iacswl3p2zrf5nuylvnhxn6kqolvojixwid.onion/something

We can simply replace the domain with scontent.xx.fbcdn.net to get the
same asset:

  https://scontent.xx.fbcdn.net/something

Side note: We don't fully understand how Facebook actually manages their
CDN. We may introduce a subtle issue by doing it this way. But we hope
we don't.

There are 3 patches in this series:

1. Don't use proxy if the host isn't an onion domain.

Speed up the HTTP request by not using the Tor proxy if the destination
host is not an onion domain. This is also a preparation to handle
Facebook assets (photos, video, files) better and faster.

2. Introduce `build_url()` function.

Introduce build_url() function to construct a URL based on the return
value of parse_url(). Currently, the only purpose of this function is to
easily change the hostname without inventing our own URL parser. This
function is taken from an answer on Stack Overflow; the link is in the
commit message of that patch.

3. Replace Facebook onion asset endpoint with non-onion.

Rewrite the host of Facebook onion CDN asset URLs to
scontent.xx.fbcdn.net in rewriteOnionURL(), using build_url() from
patch #2 to reassemble the URL.

Signed-off-by: Ammar Faizi <[email protected]>
---

The following changes since commit 68e95a61956e75ad08ad0bb68f10172fd2883816:

  Merge branch 'dev.cache' (Facebook scraper cache) (2023-05-09 17:59:03 +0700)

are available in the Git repository at:

  https://gitlab.torproject.org/ammarfaizi2/Facebook.git dev.fast_asset

for you to fetch changes up to 8a74bcd85a0ea781ed23c77527af288abcfff900:

  fb: web: Replace Facebook onion asset endpoint with non-onion (2023-05-13 01:35:01 +0700)

----------------------------------------------------------------
Ammar Faizi (3):
      fb: web: Don't use proxy if the host isn't an onion domain
      fb: helper: Introduce `build_url()` function
      fb: web: Replace Facebook onion asset endpoint with non-onion

 src/Facebook/helpers.php | 18 ++++++++++++++++++
 web/public/api.php       | 18 ++++++++++++++++++
 2 files changed, 36 insertions(+)

base-commit: 68e95a61956e75ad08ad0bb68f10172fd2883816
-- 
Ammar Faizi



* [PATCH fb v1 1/3] fb: web: Don't use proxy if the host isn't an onion domain
  2023-05-12 18:44 [PATCH fb v1 0/3] Facebook Onion assets optimization Ammar Faizi
@ 2023-05-12 18:44 ` Ammar Faizi
  2023-05-12 18:44 ` [PATCH fb v1 2/3] fb: helper: Introduce `build_url()` function Ammar Faizi
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Ammar Faizi @ 2023-05-12 18:44 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Michael William Jonathan, GNU/Weeb FB Team, GNU/Weeb Mailing List

Speed up the HTTP request by not using the Tor proxy if the destination
host is not an onion domain. This is also a preparation to handle
Facebook assets (photos, video, files) better and faster.
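
As a rough illustration of the check (a sketch only: needs_tor_proxy()
is a hypothetical name that does not exist in the repository, and
keeping the proxy when no host can be parsed is an assumption made for
this example):

```php
<?php
// Hypothetical helper, not part of this patch: decide whether a URL
// should go through the Tor proxy by testing for a ".onion" host suffix.
function needs_tor_proxy(string $url): bool
{
	// With PHP_URL_HOST, parse_url() returns the host as a string,
	// or null/false when no host can be extracted.
	$host = parse_url($url, PHP_URL_HOST);

	// Assumption: stay on the proxy when the host is unknown.
	if (!is_string($host))
		return true;

	return preg_match('/\.onion$/i', $host) === 1;
}

var_dump(needs_tor_proxy("https://scontent.xx.fbcdn.net/something")); // bool(false)
var_dump(needs_tor_proxy("https://example.onion/something"));         // bool(true)
```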

Signed-off-by: Ammar Faizi <[email protected]>
---
 web/public/api.php | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/web/public/api.php b/web/public/api.php
index a427191b21639279..1b9ef97a2fb9c868 100644
--- a/web/public/api.php
+++ b/web/public/api.php
@@ -219,6 +219,16 @@ function handle_url_proxy(Facebook $fb, string $url)
 		return 0;
 	}
 
+	if (filter_var($data, FILTER_VALIDATE_URL)) {
+		/**
+		 * Don't use proxy for non onion URL.
+		 */
+		$u = parse_url($data);
+		if (!preg_match("/\\.onion$/i", $u["host"] ?? "")) {
+			$fb->setProxy(NULL);
+		}
+	}
+
 	if (!fb_http_get($fb, $data))
 		exit(0);
 }
-- 
Ammar Faizi



* [PATCH fb v1 2/3] fb: helper: Introduce `build_url()` function
  2023-05-12 18:44 [PATCH fb v1 0/3] Facebook Onion assets optimization Ammar Faizi
  2023-05-12 18:44 ` [PATCH fb v1 1/3] fb: web: Don't use proxy if the host isn't an onion domain Ammar Faizi
@ 2023-05-12 18:44 ` Ammar Faizi
  2023-05-12 18:44 ` [PATCH fb v1 3/3] fb: web: Replace Facebook onion asset endpoint with non-onion Ammar Faizi
  2023-05-12 19:10 ` [PATCH fb v1 0/3] Facebook Onion assets optimization GNU/Weeb Facebook Team
  3 siblings, 0 replies; 5+ messages in thread
From: Ammar Faizi @ 2023-05-12 18:44 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Michael William Jonathan, GNU/Weeb FB Team, GNU/Weeb Mailing List

Introduce build_url() function to construct a URL based on the return
value of parse_url(). Currently, the only purpose of this function is to
easily change the hostname without inventing our own URL parser.

This function is taken from an answer on Stack Overflow.
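
A possible usage sketch, swapping only the host of a parsed URL. The
function body is copied from the diff below so that the example runs on
its own; example.com and example.org are placeholder hosts chosen for
illustration:

```php
<?php
// Copy of build_url() from this patch, repeated here so the example
// is self-contained.
function build_url(array $parts): string
{
	return (isset($parts['scheme']) ? "{$parts['scheme']}:" : '') .
		((isset($parts['user']) || isset($parts['host'])) ? '//' : '') .
		(isset($parts['user']) ? "{$parts['user']}" : '') .
		(isset($parts['pass']) ? ":{$parts['pass']}" : '') .
		(isset($parts['user']) ? '@' : '') .
		(isset($parts['host']) ? "{$parts['host']}" : '') .
		(isset($parts['port']) ? ":{$parts['port']}" : '') .
		(isset($parts['path']) ? "{$parts['path']}" : '') .
		(isset($parts['query']) ? "?{$parts['query']}" : '') .
		(isset($parts['fragment']) ? "#{$parts['fragment']}" : '');
}

// Split the URL, change one component, and put it back together.
$p = parse_url("https://example.com:8080/a/b?x=1#top");
$p["host"] = "example.org";

echo build_url($p), "\n"; // https://example.org:8080/a/b?x=1#top
```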

Link: https://stackoverflow.com/a/35207936/7275114
Signed-off-by: Ammar Faizi <[email protected]>
---
 src/Facebook/helpers.php | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/src/Facebook/helpers.php b/src/Facebook/helpers.php
index 224ec45d70755407..ae50775908ceb71f 100644
--- a/src/Facebook/helpers.php
+++ b/src/Facebook/helpers.php
@@ -66,3 +66,21 @@ function full_html_clean(string $m): string
 
 	return trim(implode("\n", $m));
 }
+
+/**
+ * @param array $parts
+ * @return string
+ */
+function build_url(array $parts): string
+{
+	return (isset($parts['scheme']) ? "{$parts['scheme']}:" : '') .
+		((isset($parts['user']) || isset($parts['host'])) ? '//' : '') .
+		(isset($parts['user']) ? "{$parts['user']}" : '') .
+		(isset($parts['pass']) ? ":{$parts['pass']}" : '') .
+		(isset($parts['user']) ? '@' : '') .
+		(isset($parts['host']) ? "{$parts['host']}" : '') .
+		(isset($parts['port']) ? ":{$parts['port']}" : '') .
+		(isset($parts['path']) ? "{$parts['path']}" : '') .
+		(isset($parts['query']) ? "?{$parts['query']}" : '') .
+		(isset($parts['fragment']) ? "#{$parts['fragment']}" : '');
+}
-- 
Ammar Faizi



* [PATCH fb v1 3/3] fb: web: Replace Facebook onion asset endpoint with non-onion
  2023-05-12 18:44 [PATCH fb v1 0/3] Facebook Onion assets optimization Ammar Faizi
  2023-05-12 18:44 ` [PATCH fb v1 1/3] fb: web: Don't use proxy if the host isn't an onion domain Ammar Faizi
  2023-05-12 18:44 ` [PATCH fb v1 2/3] fb: helper: Introduce `build_url()` function Ammar Faizi
@ 2023-05-12 18:44 ` Ammar Faizi
  2023-05-12 19:10 ` [PATCH fb v1 0/3] Facebook Onion assets optimization GNU/Weeb Facebook Team
  3 siblings, 0 replies; 5+ messages in thread
From: Ammar Faizi @ 2023-05-12 18:44 UTC (permalink / raw)
  To: Ammar Faizi
  Cc: Michael William Jonathan, GNU/Weeb FB Team, GNU/Weeb Mailing List

... for faster response time.

Avoid HTTP requests to Facebook onion hosts where possible, because the
Tor network is slow. When using Facebook onion, the CDN asset URL looks
like this:

  https://scontent.xx.facebookcooa4ldbat4g7iacswl3p2zrf5nuylvnhxn6kqolvojixwid.onion/something

We can simply replace the domain with scontent.xx.fbcdn.net to get the
same asset:

  https://scontent.xx.fbcdn.net/something

Side note: We don't fully understand how Facebook actually manages their
CDN. We may introduce a subtle issue by doing it this way. But we hope
we don't.
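
The host mapping can be sketched in isolation as follows.
clearnet_cdn_host() is a hypothetical helper, not a function in the
repository, and the pattern (literal dots escaped, case-insensitive) is
an assumption in the same spirit as the check in the diff below:

```php
<?php
// Hypothetical helper: map a Facebook onion CDN host to the clearnet
// CDN host, and return any other host unchanged.
function clearnet_cdn_host(string $host): string
{
	// The dots are escaped so that they only match a literal ".".
	if (preg_match('/^scontent\.xx\.face.+?\.onion$/i', $host))
		return "scontent.xx.fbcdn.net";

	return $host;
}

$onion = "scontent.xx.facebookcooa4ldbat4g7iacswl3p2zrf5nuylvnhxn6kqolvojixwid.onion";

echo clearnet_cdn_host($onion), "\n";             // scontent.xx.fbcdn.net
echo clearnet_cdn_host("www.example.org"), "\n";  // www.example.org
```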

Signed-off-by: Ammar Faizi <[email protected]>
---
 web/public/api.php | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/web/public/api.php b/web/public/api.php
index 1b9ef97a2fb9c868..b2e19789dbca21c6 100644
--- a/web/public/api.php
+++ b/web/public/api.php
@@ -165,6 +165,14 @@ function rewriteOnionURL(?string $str): ?string
 		return $str;
 	}
 
+	/**
+	 * Don't use Facebook onion CDN for performance reasons.
+	 */
+	if (preg_match("/^scontent\\.xx\\.face.+?\\.onion$/", $p["host"])) {
+		$p["host"] = "scontent.xx.fbcdn.net";
+		return build_url($p);
+	}
+
 	$signature = md5($str.API_SECRET, true);
 
 	/**
-- 
Ammar Faizi



* Re: [PATCH fb v1 0/3] Facebook Onion assets optimization
  2023-05-12 18:44 [PATCH fb v1 0/3] Facebook Onion assets optimization Ammar Faizi
                   ` (2 preceding siblings ...)
  2023-05-12 18:44 ` [PATCH fb v1 3/3] fb: web: Replace Facebook onion asset endpoint with non-onion Ammar Faizi
@ 2023-05-12 19:10 ` GNU/Weeb Facebook Team
  3 siblings, 0 replies; 5+ messages in thread
From: GNU/Weeb Facebook Team @ 2023-05-12 19:10 UTC (permalink / raw)
  To: Ammar Faizi; +Cc: Michael William Jonathan, GNU/Weeb Mailing List

The pull request you sent on Sat, 13 May 2023 01:44:08 +0700:

> https://gitlab.torproject.org/ammarfaizi2/Facebook.git dev.fast_asset

has been merged into ammarfaizi2/Facebook.git:
https://github.com/ammarfaizi2/Facebook/commit/dbc3dfcc32af4b934c0ace3dd7efbfe9878f133f

Thank you!

[1/3] fb: web: Don't use proxy if the host isn't an onion domain
      commit: 2b387f821f5b538dd6f566f74c0013cb13e2bd3b
[2/3] fb: helper: Introduce `build_url()` function
      commit: 4561609dccb7237ceeeaba2a4f594130b4c0b982
[3/3] fb: web: Replace Facebook onion asset endpoint with non-onion
      commit: 8a74bcd85a0ea781ed23c77527af288abcfff900

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


