Updated readme

Hashir Omer
2022-09-21 09:48:31 +05:00
committed by GitHub
parent 7a642d53f9
commit 7cd82db818


@@ -8,7 +8,7 @@ The code uses Upwork's internal Api to scrape new jobs posted on Upwork. I am no
# Note
-I had to write the script in Golang instead of Python because Upwork filters bots by checking TLS signatures of incoming requests. Unfortunately, I could not
+The code uses Golang instead of Python because Upwork filters bots by checking TLS signatures of incoming requests. Unfortunately, I could not
find a way to do it in pure Python because Python is compiled against OpenSSL, which popular browsers do not use: Chrome uses BoringSSL and Firefox uses NSS.
These TLS libraries send different extensions and cipher suites, which makes inspecting the TLS-level configuration a more robust way to detect bot traffic.
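
To illustrate why differing extensions and cipher suites are easy to detect, here is a minimal sketch of a JA3-style fingerprint. The README does not say which method Upwork actually uses, so JA3 is only a stand-in. The `ClientHello` type and the sample values are hypothetical, not real browser values.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// ClientHello holds the handshake fields a JA3-style fingerprint is built from.
// Hypothetical type for illustration only; it is not part of the scraper.
type ClientHello struct {
	Version      uint16
	CipherSuites []uint16
	Extensions   []uint16
	Curves       []uint16
	PointFormats []uint16
}

// joinDash renders the values as decimal numbers separated by dashes.
func joinDash(vals []uint16) string {
	parts := make([]string, len(vals))
	for i, v := range vals {
		parts[i] = strconv.Itoa(int(v))
	}
	return strings.Join(parts, "-")
}

// FingerprintString joins the five fields with commas, preserving the order
// in which the client listed each value.
func (h ClientHello) FingerprintString() string {
	return strings.Join([]string{
		strconv.Itoa(int(h.Version)),
		joinDash(h.CipherSuites),
		joinDash(h.Extensions),
		joinDash(h.Curves),
		joinDash(h.PointFormats),
	}, ",")
}

// Fingerprint hashes the string form so it can be compared against a list
// of known client fingerprints.
func (h ClientHello) Fingerprint() string {
	sum := md5.Sum([]byte(h.FingerprintString()))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Illustrative values only, not captured from a real client.
	a := ClientHello{
		Version:      771,
		CipherSuites: []uint16{4865, 4866, 49195},
		Extensions:   []uint16{0, 23, 65281},
		Curves:       []uint16{29, 23, 24},
		PointFormats: []uint16{0},
	}
	b := a
	b.CipherSuites = []uint16{49195, 49199, 156} // a different TLS library

	fmt.Println(a.FingerprintString())
	fmt.Println("same fingerprint:", a.Fingerprint() == b.Fingerprint())
}
```

Because the fingerprint depends on what the TLS library sends rather than on HTTP headers, changing the `User-Agent` alone does not change it.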
@@ -16,4 +16,8 @@ Golang is a more lower level language compared to Python, so it allows changing
# How can you contribute?
The code as of now is written as a throwaway script because I wrote it to scrape some data once. If you intend to include it in your workflow or integrate it into a larger codebase, you can contribute back by refactoring the code. I don't know Golang at all and had to Google-fu everything.
These are some of the features I think could be useful.
- Add support for automatic proxy rotation. It can be extremely effective when used in conjunction with goroutines.
- Add an API schema for the Upwork API.
- Add more scrapers. A lot of the logic is platform-agnostic and could be reused to build scrapers for other platforms.