
DirtyProxy


A quick and easy proxy scraper!

That does not mean it lacks features ;)

Features

  • High-Performance Asynchronous Scraping
  • Highly Configurable
  • Comes with a built-in proxy checker
  • Comes with a default list of proxy sites
  • Lightweight (~6% CPU usage on a 4-core i7)
  • Filters out duplicate proxies

Usage

Please note: proxy checking (enabled by default) will make scraping take MUCH longer!

Using Default Parameters

var scraper = new ProxyScraper(ProxyScraper.DefaultList);
var proxies = await scraper.ScrapeAsync();

await File.WriteAllLinesAsync("validProxies.txt", proxies.ValidProxies.Select(x=>x.ToString()));
await File.WriteAllLinesAsync("validSources.txt", proxies.ValidSources.Select(x=>x.Trim()));
await File.WriteAllLinesAsync("allProxies.txt", proxies.Proxies.Select(x=>x.ToString()));

Using Custom Proxy Source List

var sources = new[]
{
    "https://source.proxy.list",
    "https://other.source.proxy.list"
};
// You can use your own list, or the list included by default!
var scraper = new ProxyScraper(sources);

...

Using Custom Proxy Checker

var scraper = new ProxyScraper(ProxyScraper.DefaultList, async proxy =>
{
    try
    {
        using var wc = new WebClient();
        wc.Proxy = new WebProxy(proxy.ToString());
        wc.Headers[HttpRequestHeader.UserAgent] = ProxyScraper.DefaultAgent;
        // time out after 10 seconds
        using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
        cts.Token.Register(wc.CancelAsync);
        await wc.OpenReadTaskAsync("https://google.com");
        return true;
    }
    catch
    {
        return false;
    }
});
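WebClient is marked obsolete in .NET 6 and later. If you prefer HttpClient, the same checker can be sketched as below — this assumes the checker delegate has the same shape as above (a proxy in, a Task&lt;bool&gt; out); it is an illustrative alternative, not part of the library's API:

```csharp
// Sketch only: an HttpClient-based checker with the same proxy => Task<bool> shape.
var scraper = new ProxyScraper(ProxyScraper.DefaultList, async proxy =>
{
    try
    {
        // Route the request through the proxy under test
        var handler = new HttpClientHandler { Proxy = new WebProxy(proxy.ToString()) };
        using var client = new HttpClient(handler) { Timeout = TimeSpan.FromSeconds(10) };
        client.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent", ProxyScraper.DefaultAgent);
        using var response = await client.GetAsync("https://google.com");
        return response.IsSuccessStatusCode;
    }
    catch
    {
        // Timeouts, connection failures, and bad proxies all count as invalid
        return false;
    }
});
```

HttpClient's Timeout property replaces the manual CancellationTokenSource dance, and checking IsSuccessStatusCode filters out proxies that connect but return errors.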

...

Using Custom User Agent

// You can use any user agent you want!
var scraper = new ProxyScraper(ProxyScraper.DefaultList, "Your user agent");
var proxies = await scraper.ScrapeAsync();

...

Using Custom Request Timeouts

var scraper = new ProxyScraper(ProxyScraper.DefaultList, checkTimeout: 5, scrapeTimeout: 2);
var proxies = await scraper.ScrapeAsync();

...

Fast Scraping (Without proxy validation)

// Disable proxy checking
var scraper = new ProxyScraper(ProxyScraper.DefaultList, checkProxies: false);
var proxies = await scraper.ScrapeAsync();

...

Custom Proxy Check URL

// Make sure the proxies can successfully connect to a url
var scraper = new ProxyScraper(ProxyScraper.DefaultList, checkUrl: "https://google.ca");
var proxies = await scraper.ScrapeAsync();

...

Misc Configuration

// Number of concurrent tasks used for proxy checking (they spend most of their time waiting on I/O)
ProxyScraper.CheckTasks = 300;

About

A quick and dirty proxy scraper.
