diff --git a/README.md b/README.md
index 3fa6b46..d6e4c10 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ The code uses Upwork's internal Api to scrape new jobs posted on Upwork. I am no
 
 # Note
 
-I had to write the script in Golang instead of Python because Upwork filters bots by checking TLS signatures of incoming requests. Unfortunately, I could not
+The code uses Golang instead of Python because Upwork filters bots by checking TLS signatures of incoming requests. Unfortunately, I could not
 find a way to do it in pure Python because Python is compiled with openssl and popular browsers do not use it. Chrome uses BoringSSl and firefox uses NSS.
 These SSL libraries use different extensions and cipher suites which makes detection of TLS level configurations a more robust method to detect bot traffic.
 
@@ -16,4 +16,8 @@ Golang is a more lower level language compared to Python, so it allows changing
 
 # How can you contribute?
 
-The code as of now is written as a throwaway script because I wrote it to scrape some data once. If you intend to include it in your workflow or integrate it in larger codebase, you can contribute back by refactoring the code since I don't know golang at all, I had to Google fu everything.
+These are some of the features I think could be useful.
+
+- Add support for automatic proxy rotation. It can be extremely effective when used in conjunction with go routines.
+- Add Api schema for Upwork Api.
+- Add more scrapers, a lot of logic is platform agnostic which could be used to build scrapers for more platforms.
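The proxy rotation item in the new contribution list could look roughly like the sketch below. It is not taken from this repository: `Rotator`, `NewRotator`, `NewClient`, and the `127.0.0.1` proxy addresses are hypothetical names and placeholders. It also assumes the standard `net/http` client, which on its own does not reproduce the browser-like TLS signature the README describes, so the real scraper would need to wire the rotation into whatever HTTP client it already uses.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"sync"
)

// Rotator hands out proxies in round-robin order.
// It is safe to call from many goroutines at once.
type Rotator struct {
	mu      sync.Mutex
	proxies []*url.URL
	next    int
}

// NewRotator parses the given proxy addresses (placeholders in this sketch).
func NewRotator(addrs []string) (*Rotator, error) {
	if len(addrs) == 0 {
		return nil, fmt.Errorf("no proxies given")
	}
	r := &Rotator{}
	for _, a := range addrs {
		u, err := url.Parse(a)
		if err != nil {
			return nil, fmt.Errorf("parse proxy %q: %w", a, err)
		}
		r.proxies = append(r.proxies, u)
	}
	return r, nil
}

// Proxy matches the signature of the http.Transport.Proxy field,
// returning the next proxy in the list on every call.
func (r *Rotator) Proxy(_ *http.Request) (*url.URL, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	u := r.proxies[r.next%len(r.proxies)]
	r.next++
	return u, nil
}

// NewClient builds a standard-library client that asks the rotator
// which proxy to use.
func NewClient(r *Rotator) *http.Client {
	return &http.Client{Transport: &http.Transport{Proxy: r.Proxy}}
}

func main() {
	r, err := NewRotator([]string{
		"http://127.0.0.1:8001",
		"http://127.0.0.1:8002",
		"http://127.0.0.1:8003",
	})
	if err != nil {
		panic(err)
	}
	_ = NewClient(r) // would be shared by the scraping goroutines

	// Simulate nine concurrent workers picking a proxy; no network traffic.
	var wg sync.WaitGroup
	var mu sync.Mutex
	counts := map[string]int{}
	for i := 0; i < 9; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			u, _ := r.Proxy(nil)
			mu.Lock()
			counts[u.Host]++
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counts)
}
```

Because the counter is guarded by a mutex, the nine simulated workers spread evenly across the three placeholder proxies regardless of goroutine scheduling. A real implementation would probably also want to drop proxies that fail or get blocked, which this sketch leaves out.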