1. Alexa top 1,000,000 sites. http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
2. parse_url. php.net (2021)
3. Uniform Resource Identifier (URI) schemes. IANA (2021)
4. URLhaus. abuse.ch (2021)
5. Google Safe Browsing. https://safebrowsing.google.com/ (2022)