mirror of
https://github.com/orangecoding/fredy.git
synced 2026-06-16 12:31:07 +00:00
Compare commits
22 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 5347d0014d | | |
| | 946b70003f | | |
| | a6e6656882 | | |
| | fbea1aabc4 | | |
| | 2dd01ca38f | | |
| | f010e8951b | | |
| | 5225098006 | | |
| | 6e6144e02f | | |
| | aa49773a4d | | |
| | b6b8d6814c | | |
| | b8d658a948 | | |
| | bce0c57b02 | | |
| | 5e547baa76 | | |
| | b368ca7ab8 | | |
| | eb85641dfb | | |
| | 0a13037b83 | | |
| | 5600b9766b | | |
| | 63b232521e | | |
| | 2f5cc31ae3 | | |
| | 70e78492ec | | |
| | 47adb88cb5 | | |
| | e5627e1d02 | | |
21
CHANGELOG.md
@@ -1,4 +1,21 @@
|
||||
###### [V5.4.0]
|
||||
###### [V5.4.5]
|
||||
- Adding Instana node.js monitoring
|
||||
|
||||
###### [V5.4.4]
|
||||
- Add support for Immo Südwest Presse (immo.swp.de)
|
||||
- Telegram: Use job name instead of ID and link in title
|
||||
- Fix race condition if user ID is in session but not in user store
|
||||
- Allow visiting the original provider URL
|
||||
|
||||
###### [V5.4.3]
|
||||
- re-writing readme
|
||||
- improving docker build
|
||||
- using github's actions to build docker and test automatically
|
||||
|
||||
###### [V5.4.2]
|
||||
- Fixing prod build
|
||||
|
||||
###### [V5.4.1]
|
||||
- Upgrading dependencies
|
||||
- Provider URLs are now automagically changed to include the correct sort order for search results
|
||||
|
||||
@@ -45,4 +62,4 @@ on the new ui and use the values from your previous config file if needed.
|
||||
[BREAKING CHANGES]
|
||||
- The config has been changed, the config of V1.x will not work any longer
|
||||
- Sources have been renamed to provider
|
||||
```
|
||||
```
|
||||
|
||||
@@ -17,7 +17,7 @@ function normalize(o) {
|
||||
return Object.assign(o, { id });
|
||||
}
|
||||
|
||||
//apply blaclist if needed
|
||||
//apply blacklist if needed
|
||||
function applyBlacklist(o) {
|
||||
const titleNotBlacklisted = !utils.isOneOf(o.title, appliedBlackList);
|
||||
const descNotBlacklisted = !utils.isOneOf(o.description, appliedBlackList);
|
||||
|
||||
12
README.md
@@ -69,7 +69,7 @@ yarn run test
|
||||
# Architecture
|
||||

|
||||
|
||||
## Immoscout
|
||||
### Immoscout
|
||||
I have added **experimental** support for Immoscout. Immoscout is somewhat special, because they have decided to secure their service from bots using reCAPTCHA. Finding a way around this is barely possible. For _Fredy_ to be able to bypass this check, I'm using a service called [ScrapingAnt](https://scrapingant.com/). The trick is to use a headless browser, rotating proxies and (once successfully validated) to re-send the cookies each time.
|
||||
|
||||
To be able to use Immoscout, you need to create an account at ScrapingAnt. Configure the API key in the "General Settings" tab (visible when logged in as administrator).
|
||||
@@ -77,9 +77,15 @@ The rest will be handled by _Fredy_. Keep in mind, the support is experimental.
|
||||
|
||||
If you need more than the 1000 API calls allowed per month, I'd suggest opting for a paid account... ScrapingAnt loves OpenSource, therefore they have decided to give all _Fredy_ users a 10% discount by using the code **FREDY10** (Disclaimer: I do not earn any money for recommending their service).
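To illustrate the proxying described above, here is a minimal sketch of how a search URL could be wrapped for ScrapingAnt. The endpoint and the `url`, `proxy_country` and `proxy_type=residential` query parameters are taken from the code changed in this diff; the function name `wrapForScrapingAnt` and the shortened country list are assumptions made for this example, and how the API key is attached is not shown here.

```javascript
// Hypothetical helper, not Fredy's actual function: it only mirrors the URL
// shape that appears in this diff.
const PROXY_COUNTRIES = ['de', 'fr', 'nl', 'us']; // shortened, illustrative list

function wrapForScrapingAnt(targetUrl, proxyCountry) {
  // Pick a random proxy country unless the caller pins one.
  const country = proxyCountry || PROXY_COUNTRIES[Math.floor(Math.random() * PROXY_COUNTRIES.length)];
  return (
    'https://api.scrapingant.com/v1/general' +
    // The target URL must be encoded, otherwise its own query string
    // would be mixed up with ScrapingAnt's parameters.
    `?url=${encodeURIComponent(targetUrl)}` +
    `&proxy_country=${country}` +
    '&proxy_type=residential'
  );
}

console.log(wrapForScrapingAnt('https://example.com/?a=1&b=2', 'de'));
```

Encoding the target URL is the important step: without it, the `&` characters of the provider search URL would be read as additional ScrapingAnt parameters.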
|
||||
|
||||
#### Contribution guidelines
|
||||
### Contribution guidelines
|
||||
|
||||
See [Contributing](https://github.com/orangecoding/fredy/blob/master/CONTRIBUTING.md)
|
||||
See [Contributing](https://github.com/orangecoding/fredy/blob/master/CONTRIBUTING.md)
|
||||
|
||||
### Monitoring
|
||||
|
||||
_Fredy_ can be monitored by [Instana](https://www.instana.com). If you are interested, sign up for a free trial. This is totally optional of course :)
|
||||
If you want to use Instana to monitor _Fredy_, please change the variable `INSTANA_MONITORING` in the `.env` file to `true`.
|
||||
If you want to know more, head over to the [Instana docs](https://www.ibm.com/docs/en/obi/current?topic=technologies-monitoring-nodejs).
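As a rough sketch of what such a switch could look like, the snippet below reads the `INSTANA_MONITORING` flag from the environment. How _Fredy_ actually evaluates the variable is not part of this diff; the helper name `isInstanaEnabled` and the tolerant parsing (trimming, case-insensitive) are assumptions made for this example.

```javascript
// Hypothetical helper: treats only the literal value "true" as enabled.
function isInstanaEnabled(env = process.env) {
  return String(env.INSTANA_MONITORING || '').trim().toLowerCase() === 'true';
}

if (isInstanaEnabled()) {
  // The monitoring agent would be loaded here, before any other module.
  console.log('Instana monitoring enabled');
}
```

With `INSTANA_MONITORING=true` in the `.env` file the check passes; an unset or any other value leaves monitoring off.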
|
||||
|
||||
# Docker
|
||||
Use the Dockerfile in this repository to build an image.
|
||||
|
||||
84
doc/Untitled Diagram.drawio
Normal file
@@ -0,0 +1,84 @@
|
||||
<mxfile host="app.diagrams.net" modified="2022-01-29T18:34:51.211Z" agent="5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36" etag="W0jmvptvMSkuHq89hwUy" version="16.5.2" type="github">
|
||||
<diagram id="C5RBs43oDa-KdzZeNtuy" name="Page-1">
|
||||
<mxGraphModel dx="850" dy="907" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="827" pageHeight="1169" math="0" shadow="0">
|
||||
<root>
|
||||
<mxCell id="WIyWlLk6GJQsqaUBKTNV-0" />
|
||||
<mxCell id="WIyWlLk6GJQsqaUBKTNV-1" parent="WIyWlLk6GJQsqaUBKTNV-0" />
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-5" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="WIyWlLk6GJQsqaUBKTNV-3" target="WIyWlLk6GJQsqaUBKTNV-7">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="WIyWlLk6GJQsqaUBKTNV-3" value="Job1" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#dae8fc;strokeColor=#6c8ebf;" parent="WIyWlLk6GJQsqaUBKTNV-1" vertex="1">
|
||||
<mxGeometry x="100" y="50" width="120" height="30" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-8" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="WIyWlLk6GJQsqaUBKTNV-7" target="4kAlOAlRylSy7JMoHAEd-2">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-10" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="WIyWlLk6GJQsqaUBKTNV-7" target="4kAlOAlRylSy7JMoHAEd-3">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-11" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="WIyWlLk6GJQsqaUBKTNV-7" target="4kAlOAlRylSy7JMoHAEd-4">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="WIyWlLk6GJQsqaUBKTNV-7" value="FredyRuntime" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#fff2cc;strokeColor=#d6b656;" parent="WIyWlLk6GJQsqaUBKTNV-1" vertex="1">
|
||||
<mxGeometry x="110" y="120" width="360" height="40" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-6" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="4kAlOAlRylSy7JMoHAEd-0" target="WIyWlLk6GJQsqaUBKTNV-7">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-0" value="Job2" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#dae8fc;strokeColor=#6c8ebf;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="230" y="50" width="120" height="30" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-7" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="4kAlOAlRylSy7JMoHAEd-1" target="WIyWlLk6GJQsqaUBKTNV-7">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-1" value="Job3" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#dae8fc;strokeColor=#6c8ebf;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="360" y="50" width="120" height="30" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-13" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="4kAlOAlRylSy7JMoHAEd-2" target="4kAlOAlRylSy7JMoHAEd-12">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-2" value="Provider1" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#d5e8d4;strokeColor=#82b366;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="100" y="210" width="120" height="40" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-14" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="4kAlOAlRylSy7JMoHAEd-3">
|
||||
<mxGeometry relative="1" as="geometry">
|
||||
<mxPoint x="290" y="290" as="targetPoint" />
|
||||
</mxGeometry>
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-3" value="Provider2" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#d5e8d4;strokeColor=#82b366;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="230" y="210" width="120" height="40" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-15" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="4kAlOAlRylSy7JMoHAEd-4" target="4kAlOAlRylSy7JMoHAEd-12">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-4" value="Provider3" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#d5e8d4;strokeColor=#82b366;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="360" y="210" width="120" height="40" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-17" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="4kAlOAlRylSy7JMoHAEd-12" target="4kAlOAlRylSy7JMoHAEd-16">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-12" value="Similarity check" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#e1d5e7;strokeColor=#9673a6;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="110" y="290" width="360" height="40" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-20" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;entryX=0.5;entryY=0;entryDx=0;entryDy=0;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="4kAlOAlRylSy7JMoHAEd-16" target="4kAlOAlRylSy7JMoHAEd-18">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-16" value="Found similarity" style="rhombus;whiteSpace=wrap;html=1;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="250" y="360" width="80" height="80" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-21" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;" edge="1" parent="WIyWlLk6GJQsqaUBKTNV-1" source="4kAlOAlRylSy7JMoHAEd-18" target="4kAlOAlRylSy7JMoHAEd-19">
|
||||
<mxGeometry relative="1" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-18" value="Notification Adapter1" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#f8cecc;strokeColor=#b85450;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="230" y="460" width="120" height="40" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-19" value="Notification Adapter2" style="rounded=1;whiteSpace=wrap;html=1;fontSize=12;glass=0;strokeWidth=1;shadow=0;fillColor=#f8cecc;strokeColor=#b85450;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="230" y="520" width="120" height="40" as="geometry" />
|
||||
</mxCell>
|
||||
<mxCell id="4kAlOAlRylSy7JMoHAEd-22" value="No" style="text;html=1;resizable=0;autosize=1;align=center;verticalAlign=middle;points=[];fillColor=none;strokeColor=none;rounded=0;" vertex="1" parent="WIyWlLk6GJQsqaUBKTNV-1">
|
||||
<mxGeometry x="300" y="440" width="30" height="20" as="geometry" />
|
||||
</mxCell>
|
||||
</root>
|
||||
</mxGraphModel>
|
||||
</diagram>
|
||||
</mxfile>
|
||||
@@ -62,7 +62,6 @@ class FredyRuntime {
|
||||
if (this._providerConfig.paginate != null) {
|
||||
xray(u, this._providerConfig.crawlContainer, [this._providerConfig.crawlFields])
|
||||
//the first 2 pages should be enough here
|
||||
//TODO: Think about automagically sort by date
|
||||
.limit(2)
|
||||
.paginate(this._providerConfig.paginate)
|
||||
.then((listings) => {
|
||||
|
||||
@@ -1,10 +1,10 @@
|
||||
const service = require('restana')();
|
||||
const jobRouter = service.newRouter();
|
||||
const axios = require('axios');
|
||||
const jobStorage = require('../../services/storage/jobStorage');
|
||||
const userStorage = require('../../services/storage/userStorage');
|
||||
const immoscoutProvider = require('../../provider/immoscout');
|
||||
const config = require('../../../conf/config.json');
|
||||
|
||||
const { isAdmin } = require('../security');
|
||||
|
||||
function doesJobBelongsToUser(job, req) {
|
||||
@@ -30,9 +30,23 @@ jobRouter.get('/', async (req, res) => {
|
||||
});
|
||||
|
||||
jobRouter.get('/processingTimes', async (req, res) => {
|
||||
let scrapingAntData = null;
|
||||
|
||||
if (config.scrapingAnt.apiKey != null && config.scrapingAnt.apiKey.length > 0) {
|
||||
try {
|
||||
const result = await axios({
|
||||
url: `https://api.scrapingant.com/v1/usage?x-api-key=${config.scrapingAnt.apiKey}`,
|
||||
});
|
||||
scrapingAntData = result.data;
|
||||
} catch (Exception) {
|
||||
console.error('Could not query plan data from scraping ant.', Exception);
|
||||
}
|
||||
}
|
||||
|
||||
res.body = {
|
||||
interval: config.interval,
|
||||
lastRun: config.lastRun || null,
|
||||
scrapingAntData,
|
||||
};
|
||||
|
||||
res.send();
|
||||
|
||||
@@ -5,13 +5,13 @@ const hasher = require('../../services/security/hash');
|
||||
|
||||
loginRouter.get('/user', async (req, res) => {
|
||||
const currentUserId = req.session.currentUser;
|
||||
const isAdmin = currentUserId == null ? false : userStorage.getUser(currentUserId).isAdmin;
|
||||
if (currentUserId == null) {
|
||||
const currentUser = currentUserId == null ? null : userStorage.getUser(currentUserId);
|
||||
if (currentUser == null) {
|
||||
res.body = {};
|
||||
} else {
|
||||
res.body = {
|
||||
userId: currentUserId,
|
||||
isAdmin,
|
||||
userId: currentUser.id,
|
||||
isAdmin: currentUser.isAdmin,
|
||||
};
|
||||
}
|
||||
res.send();
|
||||
|
||||
52
lib/notification/adapter/mattermost.js
Normal file
@@ -0,0 +1,52 @@
|
||||
const { markdown2Html } = require('../../services/markdown');
|
||||
const { getJob } = require('../../services/storage/jobStorage');
|
||||
const axios = require('axios');
|
||||
|
||||
/**
|
||||
* sends new listings to mattermost
|
||||
* @param serviceName e.g immowelt
|
||||
* @param newListings an array with newly found listings
|
||||
* @param notificationConfig config of this notification adapter
|
||||
* @param jobKey name of the current job that is being executed
|
||||
* @returns {Promise<Void> | void}
|
||||
*/
|
||||
exports.send = ({ serviceName, newListings, notificationConfig, jobKey }) => {
|
||||
const { webhook, channel } = notificationConfig.find((adapter) => adapter.id === 'mattermost').fields;
|
||||
const job = getJob(jobKey);
|
||||
const jobName = job == null ? jobKey : job.name;
|
||||
|
||||
let message = `### *${jobName}* (${serviceName}) found **${newListings.length}** new listings:\n\n`;
|
||||
message += `| Title | Address | Size | Price |\n|:----|:----|:----|:----|\n`;
|
||||
message += newListings.map(
|
||||
(o) => `| [${o.title}](${o.link}) | ` + [o.address, o.size.replace(/2m/g, '$m^2$'), o.price].join(' | ') + ' |\n'
|
||||
);
|
||||
|
||||
return axios.post(`${webhook}`, {
|
||||
channel: channel,
|
||||
text: message,
|
||||
});
|
||||
};
|
||||
|
||||
/**
|
||||
* exported config is being used in the frontend to generate the fields
|
||||
* incoming values will be the keys (and values) of the fields
|
||||
*
|
||||
*/
|
||||
exports.config = {
|
||||
id: __filename.slice(__dirname.length + 1, -3),
|
||||
name: 'Mattermost',
|
||||
readme: markdown2Html('lib/notification/adapter/mattermost.md'),
|
||||
description: 'Fredy will send new listings to your mattermost team chat.',
|
||||
fields: {
|
||||
webhook: {
|
||||
type: 'text',
|
||||
label: 'Webhook-URL',
|
||||
description: 'The incoming webhook url',
|
||||
},
|
||||
channel: {
|
||||
type: 'text',
|
||||
label: 'Channel',
|
||||
description: 'The channel where fredy should send notifications to.',
|
||||
},
|
||||
},
|
||||
};
|
||||
5
lib/notification/adapter/mattermost.md
Normal file
@@ -0,0 +1,5 @@
|
||||
### Mattermost Adapter
|
||||
|
||||
For Mattermost, you need to create an incoming webhook. This is pretty easy: follow the steps in the [developer docs](https://docs.mattermost.com/developer/webhooks-incoming.html).
|
||||
|
||||
As a result, you get the webhook URL for configuration in fredy. In addition, the target channel must be defined.
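The following sketch shows how the payload for such a webhook could be assembled. The message layout (Markdown heading plus a four-column table) follows the adapter added in this diff; the helper name `buildMattermostPayload` and the sample values are assumptions made for this example. Note that the rows are joined explicitly, because appending an array to a string would put a comma between the rows.

```javascript
// Hypothetical helper: builds the JSON body for a Mattermost incoming webhook.
function buildMattermostPayload(channel, jobName, serviceName, listings) {
  const header =
    `### *${jobName}* (${serviceName}) found **${listings.length}** new listings:\n\n` +
    '| Title | Address | Size | Price |\n|:----|:----|:----|:----|\n';
  // join('') matters: string + array would insert "," between the table rows.
  const rows = listings
    .map((o) => `| [${o.title}](${o.link}) | ${[o.address, o.size, o.price].join(' | ')} |\n`)
    .join('');
  return { channel, text: header + rows };
}

const payload = buildMattermostPayload('town-square', 'Job', 'immoswp', [
  { title: 'A', link: 'https://x/1', address: 'Ulm', size: '50 m²', price: '500 €' },
  { title: 'B', link: 'https://x/2', address: 'Berlin', size: '70 m²', price: '900 €' },
]);
console.log(payload.text);
```

The resulting object can be sent as the JSON body of a POST request to the webhook URL, for example with `axios.post(webhook, payload)`.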
|
||||
33
lib/notification/adapter/sqlite.js
Normal file
@@ -0,0 +1,33 @@
|
||||
const { markdown2Html } = require('../../services/markdown');
|
||||
const Database = require('better-sqlite3');
|
||||
|
||||
/**
|
||||
* Stores data in a sqlite db in order to use the search results for later analytics
|
||||
* @param serviceName e.g immowelt
|
||||
* @param newListings an array with newly found listings
|
||||
* @param jobKey name of the current job that is being executed
|
||||
*/
|
||||
exports.send = ({ serviceName, newListings, jobKey }) => {
|
||||
const db = new Database('db/listings.db');
|
||||
const fields = ['serviceName', 'jobKey', 'id', 'size', 'rooms', 'price', 'address', 'title', 'link', 'description'];
|
||||
db.prepare(`CREATE TABLE IF NOT EXISTS listing (${fields.join(' TEXT, ')} TEXT);`).run();
|
||||
const insert = db.prepare(`INSERT INTO listing (${fields.join(', ')}) VALUES (@${fields.join(', @')})`);
|
||||
newListings.map((listing) => {
|
||||
let insertListing = {};
|
||||
fields.map((field) => {
|
||||
insertListing[field] = listing[field];
|
||||
});
|
||||
insertListing.serviceName = serviceName;
|
||||
insertListing.jobKey = jobKey;
|
||||
insert.run(insertListing);
|
||||
});
|
||||
return Promise.resolve();
|
||||
};
|
||||
|
||||
exports.config = {
|
||||
id: __filename.slice(__dirname.length + 1, -3),
|
||||
name: 'Sqlite',
|
||||
description: 'This adapter stores listings in a local sqlite3 database.',
|
||||
config: {},
|
||||
readme: markdown2Html('lib/notification/adapter/sqlite.md'),
|
||||
};
|
||||
3
lib/notification/adapter/sqlite.md
Normal file
@@ -0,0 +1,3 @@
|
||||
### Sqlite Adapter
|
||||
|
||||
This adapter stores search results in a SQLite database at `db/listings.db`.
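To make the table layout easier to picture, this sketch expands the two SQL statements the adapter in this diff builds from its field list. The field names and the string templates are copied from the adapter; the wrapper functions `buildStatements` and `toRow` are assumptions made for this example, and no database is opened here.

```javascript
// Field list as used by the adapter in this diff; every column is stored as TEXT.
const fields = ['serviceName', 'jobKey', 'id', 'size', 'rooms', 'price', 'address', 'title', 'link', 'description'];

// Hypothetical helper: returns the SQL strings without touching a database.
function buildStatements(columns) {
  return {
    createTable: `CREATE TABLE IF NOT EXISTS listing (${columns.join(' TEXT, ')} TEXT);`,
    // Named parameters (@id, @title, ...) are bound from an object per row.
    insert: `INSERT INTO listing (${columns.join(', ')}) VALUES (@${columns.join(', @')})`,
  };
}

// Hypothetical helper: copies only the known fields of a listing into a row object.
function toRow(listing, serviceName, jobKey) {
  const row = {};
  for (const field of fields) {
    row[field] = listing[field];
  }
  return Object.assign(row, { serviceName, jobKey });
}

console.log(buildStatements(fields).createTable);
```

Since all columns are `TEXT`, numeric analysis on `price` or `size` needs a conversion step on the query side.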
|
||||
@@ -1,4 +1,5 @@
|
||||
const { markdown2Html } = require('../../services/markdown');
|
||||
const { getJob } = require('../../services/storage/jobStorage');
|
||||
const axios = require('axios');
|
||||
|
||||
/**
|
||||
@@ -19,23 +20,24 @@ const arrayChunks = (inputArray, perChunk) =>
|
||||
* @param serviceName e.g immowelt
|
||||
* @param newListings an array with newly found listings
|
||||
* @param notificationConfig config of this notification adapter
|
||||
* * @param jobKey name of the current job that is being executed
|
||||
* @param jobKey name of the current job that is being executed
|
||||
* @returns {Promise<Void> | void}
|
||||
*/
|
||||
exports.send = ({ serviceName, newListings, notificationConfig, jobKey }) => {
|
||||
const { token, chatId } = notificationConfig.find((adapter) => adapter.id === 'telegram').fields;
|
||||
const job = getJob(jobKey);
|
||||
const jobName = job == null ? jobKey : job.name;
|
||||
|
||||
//we have to split messages into chunk, because otherwise messages are going to become too big and will fail
|
||||
const chunks = arrayChunks(newListings, 3);
|
||||
|
||||
const promises = chunks.map((chunk) => {
|
||||
let message = `Job: ${jobKey} | Service <b>${serviceName}</b> found <b>${newListings.length}</b> new listings:\n\n`;
|
||||
let message = `<i>${jobName}</i> (${serviceName}) found <b>${newListings.length}</b> new listings:\n\n`;
|
||||
message += chunk.map(
|
||||
(o) =>
|
||||
`<b>${shorten(o.title.replace(/\*/g, ''), 45)}</b>\n` +
|
||||
`<a href="${o.link}"><b>${shorten(o.title.replace(/\*/g, ''), 45).trim()}</b></a>\n` +
|
||||
[o.address, o.price, o.size].join(' | ') +
|
||||
'\n' +
|
||||
`<a href="${o.link}">${o.link}</a>\n\n`
|
||||
'\n\n'
|
||||
);
|
||||
|
||||
return axios.post(`https://api.telegram.org/bot${token}/sendMessage`, {
|
||||
|
||||
@@ -6,7 +6,7 @@ function normalize(o) {
|
||||
const id = parseInt(o.id.substring(o.id.indexOf('_') + 1, o.id.length));
|
||||
const size = o.size != null ? o.size.replace('Wohnfläche ', '') : 'N/A m²';
|
||||
const price = o.price.replace('Kaufpreis ', '');
|
||||
const address = o.address.split(' • ')[1];
|
||||
const address = o.address.split(' • ')[o.address.split(' • ').length - 1];
|
||||
const title = o.title || 'No title available';
|
||||
//normally we would just read the link from the source, but immonet decided to trick user by adding a click listener instead of
|
||||
//a href to do some weird reporting. (Very user friendly for handicaped ppl... not)
|
||||
|
||||
52
lib/provider/immoswp.js
Executable file
@@ -0,0 +1,52 @@
|
||||
const utils = require('../utils');
|
||||
|
||||
let appliedBlackList = [];
|
||||
|
||||
function normalize(o) {
|
||||
const id = o.id.substring(o.id.indexOf('-') + 1, o.id.length);
|
||||
const size = o.size || 'N/A m²';
|
||||
const price = (o.price || '--- €').replace('Preis auf Anfrage', '--- €');
|
||||
const address = o.address || 'No address available';
|
||||
const title = o.title || 'No title available';
|
||||
const link = `https://immo.swp.de/immobilien/${id}`;
|
||||
const description = o.description;
|
||||
return Object.assign(o, { id, address, price, size, title, link, description });
|
||||
}
|
||||
|
||||
function applyBlacklist(o) {
|
||||
const titleNotBlacklisted = !utils.isOneOf(o.title, appliedBlackList);
|
||||
const descNotBlacklisted = !utils.isOneOf(o.description, appliedBlackList);
|
||||
|
||||
return titleNotBlacklisted && descNotBlacklisted;
|
||||
}
|
||||
|
||||
const config = {
|
||||
url: null,
|
||||
crawlContainer: '.js-serp-item',
|
||||
sortByDateParam: 's=most_recently_updated_first',
|
||||
crawlFields: {
|
||||
id: '@id',
|
||||
price: 'div.item__spec.item-spec-price | trim',
|
||||
size: 'div.item__spec.item-spec-area | trim',
|
||||
title: 'a.js-item-title-link@title',
|
||||
address: 'div.item__locality | removeNewline | trim',
|
||||
description: 'div.item__main-info-points.clearfix p small | removeNewline | trim',
|
||||
},
|
||||
paginate: 'li.page-item.pagination__item a.page-link@href',
|
||||
normalize: normalize,
|
||||
filter: applyBlacklist,
|
||||
};
|
||||
|
||||
exports.init = (sourceConfig, blacklist) => {
|
||||
config.enabled = sourceConfig.enabled;
|
||||
config.url = sourceConfig.url;
|
||||
appliedBlackList = blacklist || [];
|
||||
};
|
||||
|
||||
exports.metaInformation = {
|
||||
name: 'Immo Südwest Presse',
|
||||
baseUrl: 'https://immo.swp.de/',
|
||||
id: __filename.slice(__dirname.length + 1, -3),
|
||||
};
|
||||
|
||||
exports.config = config;
|
||||
@@ -7,30 +7,29 @@ function makeDriver(headers = {}) {
|
||||
let cookies = '';
|
||||
|
||||
return async function driver(context, callback) {
|
||||
const url = context.url;
|
||||
let result;
|
||||
try {
|
||||
result = await axios({
|
||||
const url = context.url;
|
||||
const result = await axios({
|
||||
url,
|
||||
headers: {
|
||||
...headers,
|
||||
Cookie: cookies,
|
||||
},
|
||||
});
|
||||
|
||||
if (typeof result.data === 'object' && url.toLowerCase().indexOf('scrapingant') !== -1) {
|
||||
//assume we have gotten a response from scrapingAnt
|
||||
if (cookies.length === 0) {
|
||||
cookies = result.data.cookies;
|
||||
}
|
||||
callback(null, result.data.content);
|
||||
} else {
|
||||
callback(null, result.data);
|
||||
}
|
||||
} catch (exception) {
|
||||
console.error(`Error while trying to scrape data. Received error: ${exception.message}`);
|
||||
callback(null, []);
|
||||
}
|
||||
|
||||
if (typeof result.data === 'object' && url.toLowerCase().indexOf('scrapingant') !== -1) {
|
||||
//assume we have gotten a response from scrapingAnt
|
||||
if (cookies.length === 0) {
|
||||
cookies = result.data.cookies;
|
||||
}
|
||||
callback(null, result.data.content);
|
||||
} else {
|
||||
callback(null, result.data);
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
const { metaInformation } = require('../provider/immoscout');
|
||||
//to better confure re-capture chose a random proxy each time we do a call
|
||||
//to better configure re-capture chose a random proxy each time we do a call
|
||||
const proxies = ['ae', 'br', 'cn', 'de', 'es', 'fr', 'gb', 'hk', 'in', 'it', 'il', 'jp', 'nl', 'ru', 'sa', 'us', 'cz'];
|
||||
const config = require('../../conf/config.json');
|
||||
|
||||
@@ -12,7 +12,9 @@ exports.transformUrlForScrapingAnt = (url, id) => {
|
||||
|
||||
if (isImmoscout(id)) {
|
||||
//only do calls to scrapingAnt when dealing with Immoscout
|
||||
url = `https://api.scrapingant.com/v1/general?url=${encodeURIComponent(url)}&proxy_country=${randomProxy}`;
|
||||
url = `https://api.scrapingant.com/v1/general?url=${encodeURIComponent(
|
||||
url
|
||||
)}&proxy_country=${randomProxy}&proxy_type=residential`;
|
||||
}
|
||||
return url;
|
||||
};
|
||||
|
||||
63
package.json
@@ -1,10 +1,10 @@
|
||||
{
|
||||
"name": "fredy",
|
||||
"version": "5.4.3",
|
||||
"version": "5.4.8",
|
||||
"description": "[F]ind [R]eal [E]states [d]amn eas[y].",
|
||||
"scripts": {
|
||||
"start": "node index.js",
|
||||
"dev": "yarn && export BUILD_DEV='true' && export NODE_ENV='development' && webpack-dev-server --progress --colors --watch --config ./webpack.dev.js",
|
||||
"dev": "yarn && export BUILD_DEV='true' && export NODE_ENV='development' && webpack serve --progress --color --config ./webpack.dev.js",
|
||||
"prod": "export BUILD_DEV='false' && webpack --node-env=production --config ./webpack.prod.js",
|
||||
"format": "prettier --write lib/**/*.js ui/src/**/*.js test/**/*.js *.js --single-quote --print-width 120",
|
||||
"test": "mocha --timeout 20000 test/**/*.test.js",
|
||||
@@ -43,8 +43,8 @@
|
||||
},
|
||||
"license": "MIT",
|
||||
"engines": {
|
||||
"node": ">=12.13.0",
|
||||
"npm": ">=6.0.0"
|
||||
"node": ">=14.0.0",
|
||||
"npm": ">=7.0.0"
|
||||
},
|
||||
"browserslist": [
|
||||
"> 0.5%",
|
||||
@@ -55,19 +55,20 @@
|
||||
"dependencies": {
|
||||
"@rematch/core": "2.2.0",
|
||||
"@rematch/loading": "2.1.2",
|
||||
"@sendgrid/mail": "7.6.0",
|
||||
"axios": "0.24.0",
|
||||
"@sendgrid/mail": "7.6.2",
|
||||
"axios": "0.26.1",
|
||||
"axios-retry": "^3.2.4",
|
||||
"body-parser": "1.19.0",
|
||||
"cookie-session": "1.4.0",
|
||||
"better-sqlite3": "^7.5.0",
|
||||
"body-parser": "1.19.2",
|
||||
"cookie-session": "2.0.0",
|
||||
"handlebars": "4.7.7",
|
||||
"highcharts": "9.3.1",
|
||||
"highcharts": "10.0.0",
|
||||
"highcharts-react-official": "3.1.0",
|
||||
"lowdb": "1.0.0",
|
||||
"markdown": "^0.5.0",
|
||||
"nanoid": "3.1.30",
|
||||
"node-mailjet": "3.3.4",
|
||||
"query-string": "^7.0.1",
|
||||
"nanoid": "3.3.1",
|
||||
"node-mailjet": "3.3.7",
|
||||
"query-string": "7.1.1",
|
||||
"react": "17.0.2",
|
||||
"react-dom": "17.0.2",
|
||||
"react-redux": "7.2.6",
|
||||
@@ -75,41 +76,41 @@
|
||||
"react-router-dom": "5.3.0",
|
||||
"react-switch": "^6.0.0",
|
||||
"redux": "4.1.2",
|
||||
"redux-thunk": "2.4.0",
|
||||
"restana": "4.9.2",
|
||||
"semantic-ui-react": "2.0.4",
|
||||
"serve-static": "^1.14.1",
|
||||
"redux-thunk": "2.4.1",
|
||||
"restana": "4.9.3",
|
||||
"semantic-ui-react": "2.1.2",
|
||||
"serve-static": "1.15.0",
|
||||
"slack": "11.0.2",
|
||||
"string-similarity": "^4.0.4",
|
||||
"x-ray": "2.3.4"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@babel/core": "7.16.0",
|
||||
"@babel/preset-env": "7.16.4",
|
||||
"@babel/preset-react": "7.16.0",
|
||||
"@babel/core": "7.17.8",
|
||||
"@babel/preset-env": "7.16.11",
|
||||
"@babel/preset-react": "7.16.7",
|
||||
"babel-eslint": "10.1.0",
|
||||
"babel-loader": "8.2.3",
|
||||
"chai": "4.3.4",
|
||||
"babel-loader": "8.2.4",
|
||||
"chai": "4.3.6",
|
||||
"clean-webpack-plugin": "4.0.0",
|
||||
"copy-webpack-plugin": "10.0.0",
|
||||
"css-loader": "6.5.1",
|
||||
"copy-webpack-plugin": "10.2.4",
|
||||
"css-loader": "6.7.1",
|
||||
"eslint": "7.32.0",
|
||||
"eslint-config-prettier": "8.3.0",
|
||||
"eslint-plugin-react": "7.27.1",
|
||||
"eslint-config-prettier": "8.5.0",
|
||||
"eslint-plugin-react": "7.29.3",
|
||||
"file-loader": "6.2.0",
|
||||
"history": "5.1.0",
|
||||
"history": "5.3.0",
|
||||
"husky": "4.3.8",
|
||||
"less": "4.1.2",
|
||||
"less-loader": "10.2.0",
|
||||
"lint-staged": "12.1.2",
|
||||
"mocha": "9.1.3",
|
||||
"prettier": "2.5.0",
|
||||
"lint-staged": "12.3.7",
|
||||
"mocha": "9.2.2",
|
||||
"prettier": "2.6.1",
|
||||
"proxyquire": "2.1.3",
|
||||
"redux-logger": "3.0.6",
|
||||
"style-loader": "3.3.1",
|
||||
"url-loader": "4.1.1",
|
||||
"webpack": "5.64.4",
|
||||
"webpack-cli": "4.9.1",
|
||||
"webpack": "5.70.0",
|
||||
"webpack-cli": "4.9.2",
|
||||
"webpack-dev-server": "3.11.2",
|
||||
"webpack-merge": "5.8.0"
|
||||
}
|
||||
|
||||
51
test/provider/immoswp.test.js
Normal file
@@ -0,0 +1,51 @@
+const similarityCache = require('../../lib/services/similarity-check/similarityCache');
+const mockNotification = require('../mocks/mockNotification');
+const providerConfig = require('./testProvider.json');
+const mockStore = require('../mocks/mockStore');
+const proxyquire = require('proxyquire').noCallThru();
+const expect = require('chai').expect;
+const provider = require('../../lib/provider/immoswp');
+
+describe('#immoswp testsuite()', () => {
+  after(() => {
+    similarityCache.stopCacheCleanup();
+  });
+
+  provider.init(providerConfig.immoswp, [], []);
+  const Fredy = proxyquire('../../lib/FredyRuntime', {
+    './services/storage/listingsStorage': {
+      ...mockStore,
+    },
+    './notification/notify': mockNotification,
+  });
+
+  it('should test immoswp provider', async () => {
+    return await new Promise((resolve) => {
+      const fredy = new Fredy(provider.config, null, provider.metaInformation.id, 'test1', similarityCache);
+      fredy.execute().then((listing) => {
+        expect(listing).to.be.a('array');
+
+        const notificationObj = mockNotification.get();
+        expect(notificationObj).to.be.a('object');
+        expect(notificationObj.serviceName).to.equal('immoswp');
+
+        notificationObj.payload.forEach((notify) => {
+          /** check the actual structure **/
+          expect(notify.id).to.be.a('string');
+          expect(notify.price).to.be.a('string');
+          expect(notify.size).to.be.a('string');
+          expect(notify.title).to.be.a('string');
+          expect(notify.link).to.be.a('string');
+          expect(notify.address).to.be.a('string');
+
+          /** check the values if possible **/
+          expect(notify.price).that.does.include('€');
+          expect(notify.title).to.be.not.empty;
+          expect(notify.link).that.does.include('https://immo.swp.de');
+          expect(notify.address).to.be.not.empty;
+        });
+        resolve();
+      });
+    });
+  });
+});
@@ -32,13 +32,13 @@ describe('#immowelt testsuite()', () => {
           /** check the actual structure **/
           expect(notify.id).to.be.a('string');
           expect(notify.price).to.be.a('string');
-          expect(notify.size).to.be.a('string');
           expect(notify.title).to.be.a('string');
+
           expect(notify.link).to.be.a('string');
           expect(notify.address).to.be.a('string');

           /** check the values if possible **/
-          if (notify.size.trim().toLowerCase() !== 'k.a.') {
+          if (notify.size != null && notify.size.trim().toLowerCase() !== 'k.a.') {
             expect(notify.size).that.does.include('m²');
           }
           expect(notify.title).to.be.not.empty;
@@ -16,6 +16,10 @@
     "url": "https://www.immobilienscout24.de/Suche/de/nordrhein-westfalen/duesseldorf/wohnung-mieten?enteredFrom=one_step_search",
     "enabled": true
   },
+  "immoswp": {
+    "url": "https://immo.swp.de/suchergebnisse?l=M%C3%BCnchen&r=0km&_multiselect_r=0km&ut=private&t=apartment%3Arental&a=de.muenchen&pf=&pt=&rf=0&rt=0&sf=50&st=&yf=&yt=&ff=&ft=&s=most_recently_updated_first&pa=&o=&ad=&u=",
+    "enabled": true
+  },
   "kalaydo": {
     "url": "https://www.kalaydo.de/immobilien/eigentumswohnung-kaufen/o/duesseldorf/4/?attr_gt_estate_size_living_area=90.0&attr_gt_no_of_rooms=3.5&maxPrice=420000.00&radius=5&resultsPerPage=50&sorting=-date",
     "enabled": true
@@ -12,10 +12,6 @@ const emptyTable = () => {
   );
 };

-const truncate = (str, n) => {
-  return str.length > n ? str.substr(0, n - 1) + '…' : str;
-};
-
 const content = (providerData, onRemove) => {
   return (
     <Fragment>
@@ -23,7 +19,11 @@ const content = (providerData, onRemove) => {
         return (
           <Table.Row key={data.id}>
             <Table.Cell>{data.name}</Table.Cell>
-            <Table.Cell>{truncate(data.url, 60)}</Table.Cell>
+            <Table.Cell>
+              <a href={data.url} target="_blank" rel="noopener noreferrer">
+                Visit site
+              </a>
+            </Table.Cell>
             <Table.Cell>
               <div style={{ float: 'right' }}>
                 <Button circular color="red" icon="trash" onClick={() => onRemove(data.id)} />
@@ -1,25 +1,46 @@
 import React from 'react';
 import { format } from '../../services/time/timeService';
-import { Label } from 'semantic-ui-react';
+import { Header, Label, Message, Segment } from 'semantic-ui-react';

 export default function ProcessingTimes({ processingTimes }) {
   return (
     <React.Fragment>
-      <Label as="span" color="black">
-        Processing Interval:
-        <Label.Detail>{processingTimes.interval} min</Label.Detail>
-      </Label>
-      {processingTimes.lastRun && (
-        <React.Fragment>
-          <Label as="span" color="black">
-            Last run:
-            <Label.Detail>{format(processingTimes.lastRun)}</Label.Detail>
-          </Label>
-          <Label as="span" color="black">
-            Next run:
-            <Label.Detail>{format(processingTimes.lastRun + processingTimes.interval * 60000)}</Label.Detail>
-          </Label>
-        </React.Fragment>
+      <div>
+        <Label as="span" color="black">
+          Processing Interval:
+          <Label.Detail>{processingTimes.interval} min</Label.Detail>
+        </Label>
+        {processingTimes.lastRun && (
+          <React.Fragment>
+            <Label as="span" color="black">
+              Last run:
+              <Label.Detail>{format(processingTimes.lastRun)}</Label.Detail>
+            </Label>
+            <Label as="span" color="black">
+              Next run:
+              <Label.Detail>{format(processingTimes.lastRun + processingTimes.interval * 60000)}</Label.Detail>
+            </Label>
+          </React.Fragment>
+        )}
+      </div>
+      {processingTimes.scrapingAntData != null && (
+        <Segment inverted>
+          <Header as="h5">Remaining ScrapingAnt calls</Header>
+          <Message.List>
+            <Message.Item>Plan: {processingTimes.scrapingAntData.plan_name}</Message.Item>
+            <Message.Item>
+              Duration: {format(new Date(processingTimes.scrapingAntData.start_date))} -{' '}
+              {format(new Date(processingTimes.scrapingAntData.end_date))}
+            </Message.Item>
+            <Message.Item>
+              Credits: {processingTimes.scrapingAntData.remained_credits}/
+              {processingTimes.scrapingAntData.plan_total_credits} (250 credits per call)
+            </Message.Item>
+          </Message.List>
+          If you want to scrape Immoscout more often, you have to purchase a premium account of ScrapingAnt. You can use
+          the code <b>FREDY10</b> to get 10% off. (No affiliation, we are <b>not</b> getting paid to recommend
+          ScrapingAnt.
+        </Segment>
       )}
     </React.Fragment>
   );