Building a Threat Intelligence Feed using the Twitter API and a bit of code
The InfoSec community is highly active on Twitter. Among other things, the platform is used for sharing malware and phishing URLs through the #opendir hashtag. These URLs are very useful for correlating with DNS traffic, but they are not often used that way because Twitter feeds are difficult to interpret in an automated manner. This post goes into how Twitter can be used as a threat intelligence feed.
I’ve written a small program that receives a stream of tweets containing URLs written with ‘hxxp://’ and ‘hxxps://’. Malicious URLs are often shared in this format to prevent users from accidentally clicking on them. The program retrieves these tweets, parses the Twitter response and extracts the malicious URLs. The feed is available at twitter.threatintel.rocks and is made to be interpreted by machines.
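The extraction step can be sketched in a few lines of shell. This is not the program behind the feed, only a minimal illustration of the idea; the function name extract_urls is hypothetical, and the handling of the ‘[.]’ dot convention is an assumption, since real tweets use many more obfuscation styles than the scheme swap described above.

```shell
#!/bin/sh
# Hypothetical helper (not the feed's actual code): read free text on stdin,
# pull out URLs written as hxxp:// or hxxps://, and restore the real scheme.
# Assumption: "[.]" is the only other obfuscation that needs undoing.
extract_urls() {
    grep -oiE 'hxxps?://[^[:space:]]+' \
        | sed -E \
            -e 's/^[hH][xX][xX][pP][sS]:/https:/' \
            -e 's/^[hH][xX][xX][pP]:/http:/' \
            -e 's/\[\.\]/./g'
}

# Example with a made-up tweet text
printf '%s\n' 'New #opendir hxxp://example[.]com/payload.exe spotted' | extract_urls
```

Each matched URL is printed on its own line, so the output can be piped straight into further tooling.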
A few examples
Pretty printing the feed using jq, a command-line JSON processor, allows us to view the results more clearly.
curl https://twitter.threatintel.rocks/ --silent | jq .
Retrieving only the reported URLs
JQ also allows us to apply filters on the retrieved content. A simple cURL command with a filter on the ‘malicious_urls’ array shows us all the reported URLs transformed back into their original form.
curl https://twitter.threatintel.rocks/ --silent | jq -r '.malicious_urls | .[]'
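For correlating with DNS traffic, the hostnames matter more than the full URLs. A rough sketch of that step follows, assuming the input is one URL per line as printed by the command above; the function name hosts_from_urls is hypothetical, and URLs that carry credentials (user@host) are not handled.

```shell
#!/bin/sh
# Hypothetical helper: reduce a list of URLs (one per line on stdin) to a
# sorted, de-duplicated list of lowercase hostnames or IP addresses.
# Assumption: no user:password@ prefix appears in the URLs.
hosts_from_urls() {
    sed -E \
        -e 's|^[a-zA-Z]+://||' \
        -e 's|[/:?#].*$||' \
        | tr 'A-Z' 'a-z' \
        | LC_ALL=C sort -u
}

# Example with made-up URLs
printf '%s\n' 'http://Example.com/a' 'https://example.com:8443/b?x=1' | hosts_from_urls
```

The same function can be placed at the end of the cURL pipeline above to turn the live feed into a host list.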
Generating a list of IP addresses from the most reported URLs
The following command extracts all values from the malicious_ips array and sorts them by number of occurrences, most frequent first.
curl https://twitter.threatintel.rocks/ --silent | jq '.malicious_ips | .[]?' -r | sort | uniq -c | sort -nr
Generating a list of top contributors of malicious URLs
This list only contains contributors that have been active since the launch of the feed, but it is continuously updated.
curl https://twitter.threatintel.rocks/ --silent | jq -r .username | sort | uniq -c | sort -nr
Most SIEM tools can retrieve threat intelligence feeds over REST and parse JSON. I’ve therefore made the feed serve its data in the JSONL format (one JSON object per line), as it seemed the most logical format for now.
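Because every JSONL line is a complete JSON object, records can be processed one at a time without loading the whole feed. The sketch below flattens records into tab-separated rows that are easy to import elsewhere. The sample records are made up and only use the field names seen in the commands above (username, malicious_urls, malicious_ips); real records may carry more fields, and the function name flatten_feed is hypothetical.

```shell
#!/bin/sh
# Made-up sample records; only the field names come from the commands above.
sample='{"username":"alice","malicious_urls":["http://example.com/a"],"malicious_ips":["192.0.2.1"]}
{"username":"bob","malicious_urls":["http://example.org/b"]}'

# Hypothetical helper: emit one "username<TAB>url" row per reported URL.
# jq parses each input line as its own JSON document.
flatten_feed() {
    jq -r '.username as $u | .malicious_urls[]? | [$u, .] | @tsv'
}

printf '%s\n' "$sample" | flatten_feed
```

Replacing the printf line with the cURL command used throughout this post would apply the same transformation to the live feed.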
Have fun!