API Reference

unparallel

unparallel.RequestError dataclass

A dataclass wrapping an exception that was raised during a web request.

Besides the exception itself, this contains the URL, method, and (optional) payload of the failed request.

ATTRIBUTE DESCRIPTION
url

The target URL of the request.

TYPE: str

method

The HTTP method.

TYPE: str

payload

The payload/body of the request.

TYPE: Optional[Any]

exception

The exception that was raised.

TYPE: Exception

Source code in unparallel/unparallel.py
@dataclass
class RequestError:
    """A dataclass wrapping an exception that was raised during a web request.

    Besides the exception itself, this contains the URL, method, and (optional) payload
    of the failed request.

    Attributes:
        url (str): The target URL of the request.
        method (str): The HTTP method.
        payload (Optional[Any]): The payload/body of the request.
        exception (Exception): The exception that was raised.
    """

    url: str
    method: str
    payload: Optional[Any]
    exception: Exception
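
A wrapper like this is typically inspected after a batch of requests has finished. The sketch below shows one way to separate successful results from failures. It assumes that failed requests appear in the result list as `RequestError` instances, which this page does not state explicitly. The dataclass is redefined locally as a stand-in so the example runs without the library, and `split_results` and the example URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class RequestError:
    """Local stand-in mirroring the fields of unparallel.RequestError."""

    url: str
    method: str
    payload: Optional[Any]
    exception: Exception


def split_results(results: List[Any]) -> Tuple[List[Any], List[RequestError]]:
    """Separate successful response data from wrapped failures (hypothetical helper)."""
    successes = [r for r in results if not isinstance(r, RequestError)]
    failures = [r for r in results if isinstance(r, RequestError)]
    return successes, failures


# Hypothetical result list: two successful responses and one failed request.
results = [
    {"id": 1},
    RequestError("https://example.com/items/2", "GET", None, TimeoutError("timed out")),
    {"id": 3},
]

successes, failures = split_results(results)
for failure in failures:
    # Each failure carries enough context (URL, method, payload) to retry or log it.
    print(f"{failure.method} {failure.url} failed: {failure.exception!r}")
```

Keeping the URL, method, and payload next to the exception means a failed request can be retried or reported without looking up the original input by index.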

unparallel.up async

up(urls, method='GET', base_url=None, headers=None, payloads=None, response_fn=DEFAULT_JSON_FN, flatten_result=False, max_connections=100, timeout=10, max_retries_on_timeout=3, raise_for_status=True, limits=None, timeouts=None, client=None, progress=True, semaphore_value=USE_MAX_CONNECTIONS)

Creates async web requests to the specified URL(s) using asyncio and httpx.

PARAMETER DESCRIPTION
urls

A list of URLs as the targets for the requests. If only one URL but multiple payloads are supplied, that URL is used for all requests. If a base_url is supplied, this can also be a list of paths (or one path).

TYPE: Union[str, List[str]]

method

HTTP method to use - one of GET, OPTIONS, HEAD, POST, PUT, PATCH, or DELETE. Defaults to GET.

TYPE: str DEFAULT: 'GET'

base_url

The base URL of the target API/service. Defaults to None.

TYPE: Optional[str] DEFAULT: None

headers

A dictionary of headers to use. Defaults to None.

TYPE: Optional[Dict[str, Any]] DEFAULT: None

payloads

A list of JSON payloads (dictionaries), e.g. for HTTP POST requests. Used together with urls. If one payload but multiple URLs are supplied, that payload is used for all requests. Defaults to None.

TYPE: Optional[Any] DEFAULT: None

response_fn

The function (callback) to apply to every response of the HTTP requests. This can be an existing method of httpx.Response like .json() or .read(), or a custom function that takes the httpx.Response as its argument and returns Any. If you set this to None, you will get the raw httpx.Response. Defaults to httpx.Response.json.

TYPE: Optional[Callable[[Response], Any]] DEFAULT: DEFAULT_JSON_FN

flatten_result

If True and the response per request is a list, flatten that list of lists. This is useful when using paging. Defaults to False.

TYPE: bool DEFAULT: False

max_connections

The total number of simultaneous TCP connections. Defaults to 100. This is passed into httpx.Limits.

TYPE: int DEFAULT: 100

timeout

The timeout for requests in seconds. Defaults to 10. This is passed into httpx.Timeout.

TYPE: int DEFAULT: 10

max_retries_on_timeout

The maximum number of retries if a request fails due to a timeout (httpx.TimeoutException). Defaults to 3.

TYPE: int DEFAULT: 3

raise_for_status

If True, .raise_for_status() is called on every response. Defaults to True.

TYPE: bool DEFAULT: True

limits

The limits configuration for httpx. If specified, this overrides the max_connections parameter.

TYPE: Optional[Limits] DEFAULT: None

timeouts

The timeout configuration for httpx. If specified, this overrides the timeout parameter.

TYPE: Optional[Timeout] DEFAULT: None

client

An instance of httpx.AsyncClient to be used for creating the HTTP requests. Note that if you pass a client, all other options that parametrize the client (base_url, headers, limits, and timeouts) are ignored. Defaults to None.

TYPE: Optional[AsyncClient] DEFAULT: None

progress

If set to True, a progress bar is shown. Defaults to True.

TYPE: bool DEFAULT: True

semaphore_value

The value for the asyncio.Semaphore object that synchronizes the calls to HTTPX. Defaults to the value of max_connections, capped at MAX_SEMAPHORE_COUNT.

TYPE: Union[int, UseMaxConnections, None] DEFAULT: USE_MAX_CONNECTIONS
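
To illustrate what flatten_result is meant to achieve for paged endpoints, here is a minimal sketch of flattening per-request lists into one list. It is not the library's implementation: `flatten` is a hypothetical helper, and keeping non-list items as single elements is an assumption, since this page does not say how such items are handled.

```python
from typing import Any, List


def flatten(results: List[Any]) -> List[Any]:
    """Merge per-request lists into one flat list (illustrative helper)."""
    flat: List[Any] = []
    for item in results:
        if isinstance(item, list):
            flat.extend(item)
        else:
            # Assumption: anything that is not a list is kept as a single element.
            flat.append(item)
    return flat


# Hypothetical responses of three page requests, each returning a list of items.
pages = [[{"id": 1}, {"id": 2}], [{"id": 3}], []]
items = flatten(pages)
print(items)  # [{'id': 1}, {'id': 2}, {'id': 3}]
```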

RAISES DESCRIPTION
ValueError

If the HTTP method is not valid.

ValueError

If the number of URLs provided does not match the number of payloads (except if there is only one URL or only one payload).

RETURNS DESCRIPTION
List[Any]

A list of the response data per request in the same order as the input (URLs/payloads).
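
The pairing rule between urls and payloads can be hard to picture from the description alone. The following sketch restates it as a standalone function with a few worked inputs. `align_urls_and_payloads` is a hypothetical name used only for illustration and is not part of the unparallel API.

```python
from typing import Any, List, Optional, Tuple, Union


def align_urls_and_payloads(
    urls: Union[str, List[str]], payloads: Optional[Any]
) -> Tuple[List[str], Optional[List[Any]]]:
    """Pair URLs with payloads, repeating a single URL or payload as needed."""
    if isinstance(urls, str):
        urls = [urls]
    if payloads:
        if not isinstance(payloads, list):
            payloads = [payloads]
        if len(urls) == 1 and len(payloads) > 1:
            urls = urls * len(payloads)  # one URL, many payloads
        if len(payloads) == 1 and len(urls) > 1:
            payloads = payloads * len(urls)  # one payload, many URLs
        if len(urls) != len(payloads):
            raise ValueError(
                f"The number of URLs does not match the number of payloads: "
                f"{len(urls)} != {len(payloads)}"
            )
    return urls, payloads


# One URL is reused for every payload.
print(align_urls_and_payloads("/items", [{"a": 1}, {"a": 2}]))
# (['/items', '/items'], [{'a': 1}, {'a': 2}])

# One payload is reused for every URL.
print(align_urls_and_payloads(["/a", "/b"], {"x": 1}))
# (['/a', '/b'], [{'x': 1}, {'x': 1}])

# Two URLs with three payloads cannot be paired and raise a ValueError.
try:
    align_urls_and_payloads(["/a", "/b"], [{"x": 1}, {"x": 2}, {"x": 3}])
except ValueError as error:
    print(error)
```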

Source code in unparallel/unparallel.py
async def up(
    urls: Union[str, List[str]],
    method: str = "GET",
    base_url: Optional[str] = None,
    headers: Optional[Dict[str, Any]] = None,
    payloads: Optional[Any] = None,
    response_fn: Optional[Callable[[httpx.Response], Any]] = DEFAULT_JSON_FN,
    flatten_result: bool = False,
    max_connections: Optional[int] = 100,
    timeout: Optional[int] = 10,
    max_retries_on_timeout: int = 3,
    raise_for_status: bool = True,
    limits: Optional[httpx.Limits] = None,
    timeouts: Optional[httpx.Timeout] = None,
    client: Optional[httpx.AsyncClient] = None,
    progress: bool = True,
    semaphore_value: Union[int, UseMaxConnections, None] = USE_MAX_CONNECTIONS,
) -> List[Any]:
    """Creates async web requests to the specified URL(s) using ``asyncio``
    and ``httpx``.

    Args:
        urls (Union[str, List[str]]): A list of URLs as the targets for the requests.
            If only one URL but multiple payloads are supplied, that URL is used for
            all requests.
            If a ``base_url`` is supplied, this can also be a list of paths
            (or one path).
        method (str): HTTP method to use - one of ``GET``, ``OPTIONS``, ``HEAD``,
            ``POST``, ``PUT``, ``PATCH``, or ``DELETE``. Defaults to ``GET``.
        base_url (Optional[str]):  The base URL of the target API/service. Defaults to
            ``None``.
        headers (Optional[Dict[str, Any]], optional): A dictionary of headers to use.
            Defaults to ``None``.
        payloads (Optional[Any], optional): A list of JSON payloads (dictionaries), e.g.
            for HTTP POST requests. Used together with ``urls``. If one payload but
            multiple URLs are supplied, that payload is used for all requests.
            Defaults to ``None``.
        response_fn (Optional[Callable[[httpx.Response], Any]]): The function (callback)
            to apply on every response of the HTTP requests. This can be an existing
            function of ``httpx.Response`` like ``.json()`` or ``.read()``, or a custom
            function that takes the ``httpx.Response`` as its argument and returns
            ``Any``. If you set this to ``None``, you will get the raw
            ``httpx.Response``. Defaults to ``httpx.Response.json``.
        flatten_result (bool): If True and the response per request is a list,
            flatten that list of lists. This is useful when using paging.
            Defaults to ``False``.
        max_connections (int): The total number of simultaneous TCP
            connections. Defaults to ``100``. This is passed into ``httpx.Limits``.
        timeout (int): The timeout for requests in seconds. Defaults to 10.
            This is passed into ``httpx.Timeout``.
        max_retries_on_timeout (int): The maximum number of retries if a request fails
            due to a timeout (``httpx.TimeoutException``). Defaults to ``3``.
        raise_for_status (bool): If True, ``.raise_for_status()`` is called on every
            response. Defaults to ``True``.
        limits (Optional[httpx.Limits]): The limits configuration for ``httpx``.
            If specified, this overrides the ``max_connections`` parameter.
        timeouts (Optional[httpx.Timeout]): The timeout configuration for ``httpx``.
            If specified, this overrides the ``timeout`` parameter.
        client (Optional[httpx.AsyncClient]): An instance of ``httpx.AsyncClient`` to be
            used for creating the HTTP requests. **Note that if you pass a client, all
            other options that parametrize the client (``base_url``, ``headers``,
            ``limits``, and ``timeouts``) are ignored**. Defaults to ``None``.
        progress (bool): If set to ``True``, a progress bar is shown.
            Defaults to ``True``.
        semaphore_value (Union[int, UseMaxConnections, None]): The value for the
            ``asyncio.Semaphore`` object that synchronizes the calls to HTTPX. Defaults
            to the value of ``max_connections``, capped at ``MAX_SEMAPHORE_COUNT``.

    Raises:
        ValueError: If the HTTP method is not valid.
        ValueError: If the number of URLs provided does not match the number of
            payloads (except if there is only one URL or only one payload).

    Returns:
        List[Any]:  A list of the response data per request in the same order as the
        input (URLs/payloads).
    """
    # Check if the method is valid
    if method.upper() not in VALID_HTTP_METHODS:
        raise ValueError(
            f"The method '{method}' is not a supported HTTP method. "
            f"Supported methods: {VALID_HTTP_METHODS}"
        )

    # Wrap single URL into list to check for alignment with payload
    if isinstance(urls, str):
        urls = [urls]

    # Check if payloads align with URLs
    if payloads:
        if not isinstance(payloads, list):
            payloads = [payloads]
        if len(urls) == 1 and len(payloads) > 1:
            logging.info(f"Using URL '{urls[0]}' for all {len(payloads)} payloads")
            urls = urls * len(payloads)
        if len(payloads) == 1 and len(urls) > 1:
            logging.info(f"Using payload '{payloads[0]}' for all {len(urls)} URLs")
            payloads = payloads * len(urls)
        if len(urls) != len(payloads):
            raise ValueError(
                f"The number of URLs does not match the number of payloads: "
                f"{len(urls)} != {len(payloads)}"
            )

    if timeouts is None:
        timeouts = httpx.Timeout(timeout)
    if limits is None:
        if max_connections != DEFAULT_LIMITS.max_connections:
            limits = httpx.Limits(max_connections=max_connections)
            limits.max_keepalive_connections = DEFAULT_LIMITS.max_keepalive_connections
        else:
            limits = DEFAULT_LIMITS

    # After some benchmarking we discovered that synchronizing the HTTP requests
    # with a semaphore object that has the same value as the max_connections gives
    # the best performance.
    # Also, limiting the semaphore value to a maximum of 1k drastically reduced the
    # number of timeouts.
    if isinstance(semaphore_value, UseMaxConnections):
        semaphore_value = min(
            max_connections or MAX_SEMAPHORE_COUNT, MAX_SEMAPHORE_COUNT
        )

    return await request_urls(
        urls=urls,
        method=method,
        base_url=base_url,
        headers=headers,
        payloads=payloads,
        response_fn=response_fn,
        flatten_result=flatten_result,
        max_retries_on_timeout=max_retries_on_timeout,
        raise_for_status=raise_for_status,
        limits=limits,
        timeouts=timeouts,
        client=client,
        progress=progress,
        semaphore_value=semaphore_value,
    )
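
The semaphore logic above bounds how many requests are in flight at once. The sketch below reproduces the default-resolution step and demonstrates the bounding pattern with plain `asyncio`. It is not the library's `request_urls` implementation: `resolve_semaphore_value`, `run_bounded`, and `fake_request` are hypothetical names, and the value `1000` for `MAX_SEMAPHORE_COUNT` is an assumption based on the "maximum of 1k" comment in the source.

```python
import asyncio
from typing import Optional

# Assumed value, inferred from the "maximum of 1k" comment in the source.
MAX_SEMAPHORE_COUNT = 1000


def resolve_semaphore_value(max_connections: Optional[int]) -> int:
    """Default semaphore value: max_connections, capped at MAX_SEMAPHORE_COUNT."""
    return min(max_connections or MAX_SEMAPHORE_COUNT, MAX_SEMAPHORE_COUNT)


async def run_bounded(n_tasks: int, limit: int) -> int:
    """Run n_tasks simulated requests and return the peak number in flight."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_request() -> None:
        nonlocal in_flight, peak
        async with semaphore:  # at most `limit` tasks are inside this block
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stands in for the actual HTTP call
            in_flight -= 1

    await asyncio.gather(*(fake_request() for _ in range(n_tasks)))
    return peak


print(resolve_semaphore_value(100))   # 100
print(resolve_semaphore_value(5000))  # 1000
print(resolve_semaphore_value(None))  # 1000

peak = asyncio.run(run_bounded(n_tasks=20, limit=5))
print(peak)  # 5
```

All 20 tasks are created up front, but only five at a time get past the semaphore. The same principle lets `up` accept a long URL list without opening every request at once.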