Rewrite of generate-environment-identifiers-dict.sh
#1028 opened on May 30, 2024
Description
Describe the feature request: The current script uses Majestic's domain list, which may be missing many domains that appear in the other lists linked below. It also opens a new (insecure) SSL session for every domain in the list, which is both insecure and inefficient.
Recommendations for additional lists would be appreciated, as some of these lists may be outdated.
Additional context:
You can use this command to interact with the crt.sh SQL server directly:
psql -h crt.sh -p 5432 -U guest certwatch
https://groups.google.com/g/crtsh/c/sUmV0mBz8bQ/m/K-6Vymd_AAAJ
Domain lists:
- https://hackertarget.com/top-million-site-list-download/
- https://radar.cloudflare.com/domains
- https://www.domcop.com/top-10-million-websites
- https://s3-us-west-1.amazonaws.com/umbrella-static/index.html
- https://majestic.com/reports/majestic-million
- https://builtwith.com/top-sites
- https://tranco-list.eu/
- https://statvoo.com/dl/top-1million-sites.csv.zip
Next steps:
- Implement a script that pulls domains from DomCop, Alexa, Cloudflare, Majestic and others
- Dedupe the combined list and find better ways to extract environment IDs
- Switch to the SQL interface
- I intend to open a pull request later
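
A rough sketch of the merge and dedupe step, to make the idea concrete. It is not the final script: `merge_domain_lists` is a hypothetical helper name, and it assumes each downloaded list has one entry per line, either as a bare domain or as `rank,domain`. Lists with a different column layout would need their own column selection before being passed in.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the existing script): merge several
# already-downloaded top-site lists into one deduplicated domain list.
# Assumption: every input line is either "domain" or "rank,domain".
merge_domain_lists() {
  awk -F',' '
    {
      d = tolower($NF)              # the last comma-separated field holds the domain
      gsub(/[ \t\r"]/, "", d)       # drop spaces, tabs, carriage returns and quotes
      # keep only values that look like a hostname, and print each one once
      if (d ~ /^[a-z0-9.-]+\.[a-z0-9-]+$/ && !seen[d]++) print d
    }
  ' "$@"
}
```

Usage would look like `merge_domain_lists tranco.csv umbrella.csv > domains.txt` (the file names are placeholders). Header rows such as `rank,domain` are dropped because they contain no dot, and the output keeps the order in which each domain was first seen.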