
Commit 5f78e02

Merge pull request #6 from proxymesh/feature/autoscraper-extension
Feature/autoscraper extension
2 parents a8b3055 + ed7e1ff commit 5f78e02

4 files changed: 570 additions & 0 deletions


docs/autoscraper.rst

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
AutoScraper
===========

The ``autoscraper_proxy`` module provides proxy header support for AutoScraper.

Installation
------------

First, install AutoScraper::

    pip install autoscraper

Then you can use the proxy header extension.
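As a quick sanity check (a minimal sketch; the import paths are the ones used
throughout this page), confirm that both AutoScraper and the extension import
cleanly:

.. code-block:: python

    # Both imports should succeed once autoscraper and this package are installed.
    from autoscraper import AutoScraper
    from python_proxy_headers.autoscraper_proxy import ProxyAutoScraper

    # ProxyAutoScraper subclasses AutoScraper (see the API reference below).
    assert issubclass(ProxyAutoScraper, AutoScraper)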
Usage
-----

Basic Usage
~~~~~~~~~~~

The ``ProxyAutoScraper`` class is a drop-in replacement for ``AutoScraper``
that adds proxy header capabilities:

.. code-block:: python

    from python_proxy_headers.autoscraper_proxy import ProxyAutoScraper

    # Create a scraper with proxy headers
    scraper = ProxyAutoScraper(proxy_headers={'X-ProxyMesh-Country': 'US'})

    # Build rules from a sample page
    result = scraper.build(
        url='https://finance.yahoo.com/quote/AAPL/',
        wanted_list=['Apple Inc.'],
        request_args={'proxies': {'https': 'http://proxy.example.com:8080'}}
    )

    print(result)
Using Learned Rules
~~~~~~~~~~~~~~~~~~~

Once you've built rules, you can use them on other pages:

.. code-block:: python

    from python_proxy_headers.autoscraper_proxy import ProxyAutoScraper

    scraper = ProxyAutoScraper(proxy_headers={'X-ProxyMesh-Country': 'US'})

    # Build rules
    scraper.build(
        url='https://finance.yahoo.com/quote/AAPL/',
        wanted_list=['Apple Inc.'],
        request_args={'proxies': {'https': 'http://proxy:8080'}}
    )

    # Use rules on another page
    result = scraper.get_result_similar(
        url='https://finance.yahoo.com/quote/GOOG/',
        request_args={'proxies': {'https': 'http://proxy:8080'}}
    )

    print(result)  # ['Alphabet Inc.']
Saving and Loading Rules
~~~~~~~~~~~~~~~~~~~~~~~~

You can save and load learned rules:

.. code-block:: python

    scraper = ProxyAutoScraper(proxy_headers={'X-ProxyMesh-Country': 'US'})

    # Build and save rules
    scraper.build(url='...', wanted_list=['...'])
    scraper.save('my_rules.json')

    # Later, load rules
    scraper2 = ProxyAutoScraper(proxy_headers={'X-ProxyMesh-Country': 'UK'})
    scraper2.load('my_rules.json')
Context Manager
~~~~~~~~~~~~~~~

Use as a context manager to ensure proper cleanup:

.. code-block:: python

    with ProxyAutoScraper(proxy_headers={'X-Custom': 'value'}) as scraper:
        result = scraper.build(
            url='https://example.com',
            wanted_list=['Example Domain'],
            request_args={'proxies': {'https': 'http://proxy:8080'}}
        )
Updating Proxy Headers
~~~~~~~~~~~~~~~~~~~~~~

You can update proxy headers at runtime:

.. code-block:: python

    scraper = ProxyAutoScraper(proxy_headers={'X-Country': 'US'})

    # Make some requests...

    # Change proxy headers
    scraper.set_proxy_headers({'X-Country': 'UK'})

    # Subsequent requests use new headers
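Combined with saved rules, this makes it straightforward to fetch the same page
through different proxy exits. A sketch (the country codes, URL, and rules file
are illustrative):

.. code-block:: python

    scraper = ProxyAutoScraper(proxy_headers={'X-ProxyMesh-Country': 'US'})
    scraper.load('my_rules.json')

    results = {}
    for country in ('US', 'UK', 'JP'):  # illustrative country codes
        # The new headers take effect on the next request
        scraper.set_proxy_headers({'X-ProxyMesh-Country': country})
        results[country] = scraper.get_result_similar(
            url='https://finance.yahoo.com/quote/AAPL/',
            request_args={'proxies': {'https': 'http://proxy:8080'}}
        )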
API Reference
-------------

ProxyAutoScraper Class
~~~~~~~~~~~~~~~~~~~~~~

.. py:class:: ProxyAutoScraper(proxy_headers=None, stack_list=None)

   AutoScraper subclass with proxy header support.

   Inherits all methods from ``autoscraper.AutoScraper``.

   :param proxy_headers: Dict of headers to send to proxy servers
   :param stack_list: Initial stack list (rules) for the scraper

   .. py:method:: set_proxy_headers(proxy_headers)

      Update the proxy headers. A new session is created on the next request.

      :param proxy_headers: New proxy headers to use

   .. py:method:: close()

      Close the underlying session.

   .. py:method:: build(url=None, wanted_list=None, wanted_dict=None, html=None, request_args=None, update=False, text_fuzz_ratio=1.0)

      Build scraping rules with proxy header support.

      :param url: URL of the target web page
      :param wanted_list: List of needed contents to be scraped
      :param wanted_dict: Dict of needed contents (keys are aliases)
      :param html: HTML string (alternative to URL)
      :param request_args: Request arguments, including proxies
      :param update: If True, add to existing rules
      :param text_fuzz_ratio: Fuzziness ratio for matching
      :returns: List of similar results

   .. py:method:: get_result_similar(url=None, html=None, soup=None, request_args=None, ...)

      Get similar results with proxy header support.

   .. py:method:: get_result_exact(url=None, html=None, soup=None, request_args=None, ...)

      Get exact results with proxy header support.

   .. py:method:: get_result(url=None, html=None, request_args=None, ...)

      Get both similar and exact results with proxy header support.
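As a brief illustration of ``wanted_dict`` (a sketch; the URL and sample values
are illustrative, and ``group_by_alias`` is inherited from the AutoScraper base
class):

.. code-block:: python

    scraper = ProxyAutoScraper(proxy_headers={'X-ProxyMesh-Country': 'US'})

    # Each alias ('name' here) keys a list of sample values to learn from
    scraper.build(
        url='https://finance.yahoo.com/quote/AAPL/',
        wanted_dict={'name': ['Apple Inc.']},
        request_args={'proxies': {'https': 'http://proxy:8080'}}
    )

    # Results can then be grouped by alias
    result = scraper.get_result_similar(
        url='https://finance.yahoo.com/quote/GOOG/',
        request_args={'proxies': {'https': 'http://proxy:8080'}},
        group_by_alias=True
    )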

docs/index.rst

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,7 @@ We currently provide extensions to the following packages:
 * :doc:`requests <requests>` - Simple HTTP library for Python
 * :doc:`aiohttp <aiohttp>` - Async HTTP client/server framework
 * :doc:`httpx <httpx>` - Modern HTTP client library
+* :doc:`autoscraper <autoscraper>` - Smart automatic web scraper

 Purpose
 -------
@@ -50,6 +51,7 @@ Contents
    requests
    aiohttp
    httpx
+   autoscraper

 Indices and tables
 ==================
