# Semlib

> Semlib: A Python library for building data processing and analysis pipelines with LLMs.

# Home

# Semlib

Semlib is a Python library for building data processing and data analysis pipelines that leverage the power of large language models (LLMs). Semlib provides, as building blocks, familiar functional programming primitives like map, reduce, sort, and filter, but with a twist: Semlib's implementations of these operations are **programmed with natural language descriptions** rather than code. Under the hood, Semlib handles complexities such as prompting, parsing, concurrency control, caching, and cost tracking.

```pycon
>>> presidents = await prompt(
...     "Who were the 39th through 42nd presidents of the United States?",
...     return_type=Bare(list[str])
... )
>>> await sort(presidents, by="right-leaning", reverse=True)  # highest first
['Ronald Reagan', 'George H. W. Bush', 'Bill Clinton', 'Jimmy Carter']
>>> await find(presidents, by="former actor")
'Ronald Reagan'
>>> await map(
...     presidents,
...     "How old was {} when he took office?",
...     return_type=Bare(int),
... )
[52, 69, 64, 46]
```

## Rationale

Large language models are great at natural-language data processing and data analysis tasks, but when you have a large amount of data, you can't get high-quality results by just dumping all the data into a long-context LLM and asking it to complete a complex task in a single shot. Even with today's reasoning models and agents, this approach doesn't give great results. This library provides an alternative. You can structure your computation using the building blocks that Semlib provides: functional programming primitives upgraded to handle semantic operations. This approach has a number of benefits.

**Quality.** By breaking down a sophisticated data processing task into simpler steps that are solved by today's LLMs, you can get higher-quality results, even in situations where today's LLMs might be capable of processing the data in a single shot and ending up with barely acceptable results. (example: analyzing support tickets in [Airline Support Report](examples/airline-support/))

**Feasibility.** Even long-context LLMs have limitations (e.g., 1M tokens in today's frontier models). Furthermore, performance often drops off with longer inputs. By breaking down the data processing task into smaller steps, you can handle arbitrary-sized data. (example: sorting an arbitrary number of arXiv papers in [arXiv Paper Recommendations](examples/arxiv-recommendations/))

**Latency.** By breaking down the computation into smaller pieces and structuring it using functional programming primitives like `map` and `reduce`, the parts of the computation can be run concurrently, reducing the latency of the overall computation.
(example: tree reduce with O(log n) computation depth in [Disneyland Reviews Synthesis](examples/disneyland-reviews/)) **Cost.** By breaking down the computation into simpler sub-tasks, you can use smaller and cheaper models that are capable of solving those sub-tasks, which can reduce data processing costs. Furthermore, you can choose the model on a per-subtask basis, allowing you to further optimize costs. (example: using `gpt-4.1-nano` for the pre-filtering step in [arXiv Paper Recommendations](examples/arxiv-recommendations/)) **Security.** By breaking down the computation into tasks that simpler models can handle, you can use open models that you host yourself, allowing you to process sensitive data without having to trust a third party. (example: using `gpt-oss` and `qwen3` in [Resume Filtering](examples/resume-filtering/)) **Flexibility.** LLMs are great at certain tasks, like natural-language processing. They're not so great at other tasks, like multiplying numbers. Using Semlib, you can break down your data processing task into multiple steps, some of which use LLMs and others that just use regular old Python code, getting the best of both worlds. (example: Python code for filtering in [Resume Filtering](examples/resume-filtering/)) Read more about the rationale, the story behind this library, and related work in the [**blog post**](https://anishathalye.com/semlib/). ## `llms.txt` This documentation is available in [`llms.txt`](https://llmstxt.org/) format, a Markdown-based format optimized for LLMs and AI coding assistants. Two versions are available: - [`llms.txt`](https://semlib.anish.io/llms.txt) contains a description of the project and links to all documentation sections. - [`llms-full.txt`](https://semlib.anish.io/llms-full.txt) includes the complete content of all documentation pages. Note that this may be too large for some LLMs. These files are not currently automatically used by IDEs or coding agents, but you can provide the full text or link to them when using AI tools. ## Citation ```bibtex @misc{athalye:semlib, author = {Anish Athalye}, title = {{Semlib}: Semantic data processing for {Python}}, year = {2025}, howpublished = {\url{https://github.com/anishathalye/semlib}}, } ``` # Quickstart[¶](#quickstart) This notebook gives a quick overview of some Semlib functionality. For a more in-depth treatment, see the [Examples](../examples/). Using a Jupyter notebook or an asyncio REPL (`python3 -m asyncio`), import the library and start a session: In \[1\]: Copied! ``` from semlib import Bare, Session session = Session() ``` from semlib import Bare, Session session = Session() [prompt](../api/#semlib.Session.prompt) is a basic wrapper around prompting an LLM and optionally getting structured output: In \[2\]: Copied! ``` presidents: list[str] = await session.prompt( "Who were the 39th through 42nd presidents of the United States? Return the name only.", return_type=Bare(list[str]), ) presidents ``` presidents: list[str] = await session.prompt( "Who were the 39th through 42nd presidents of the United States? Return the name only.", return_type=Bare(list[str]), ) presidents Out\[2\]: ``` ['Jimmy Carter', 'Ronald Reagan', 'George H. W. Bush', 'Bill Clinton'] ``` [sort](../api/#semlib.Session.sort) sorts a list of items based on a ranking criterion: In \[3\]: Copied! ``` await session.sort(presidents, by="right-leaning", reverse=True) # highest first ``` await session.sort(presidents, by="right-leaning", reverse=True) # highest first Out\[3\]: ``` ['Ronald Reagan', 'George H. 
W. Bush', 'Bill Clinton', 'Jimmy Carter'] ``` [filter](../api/#semlib.Session.filter) filters a list of items based on a criterion: In \[4\]: Copied! ``` await session.filter(presidents, by="former actor", negate=True) ``` await session.filter(presidents, by="former actor", negate=True) Out\[4\]: ``` ['Jimmy Carter', 'George H. W. Bush', 'Bill Clinton'] ``` [map](../api/#semlib.Session.map) transforms a list of items based on a prompt template: In \[5\]: Copied! ``` ages: list[int] = await session.map( presidents, template="How old was {} when he took office?", return_type=Bare(int), ) ages ``` ages: list[int] = await session.map( presidents, template="How old was {} when he took office?", return_type=Bare(int), ) ages Out\[5\]: ``` [52, 69, 64, 46] ``` [total_cost](../api/#semlib.Session.total_cost) returns the total cost of all LLM calls made in the session: In \[6\]: Copied! ``` f"${session.total_cost():.3f}" ``` f"${session.total_cost():.3f}" Out\[6\]: ``` '$0.007' ``` To see the full list of supported functionality, take a look at the methods defined on [Session](../api/#semlib.Session) or take a look at the [examples](../examples/). # Library Design Semlib makes use of [type annotations](https://typing.python.org/en/latest/spec/index.html), [generics](https://typing.python.org/en/latest/spec/generics.html), and [overloads](https://typing.python.org/en/latest/spec/overload.html). Semlib does not perform runtime checks for invalid arguments that would be caught by a type checker, so you should use a type checker like [mypy](https://mypy-lang.org/) for code utilizing Semlib. Semlib follows a design pattern where many functions and methods (e.g., map) accept a `return_type` argument that simplifies working with [structured output](https://platform.openai.com/docs/guides/structured-outputs). When `return_type` is not provided, the desired output is assumed to be a string. When `return_type` is provided, Semlib will parse the LLM's output into the specified type. Semlib uses [Pydantic](https://pydantic.dev/) for parsing and validation, so you can pass a Pydantic model as the `return_type`. As an alternative, Semlib provides the Bare class, which can be used to mark a bare type like `int` or `list[float]` so that it can be used as the `return_type`. Semlib uses [LiteLLM](https://github.com/BerriAI/litellm) as an abstraction layer for LLMs. You can use any LLM [supported by LiteLLM](https://docs.litellm.ai/docs/providers) with Semlib, choosing your model using the `model` argument. For example, you can set the `ANTHROPIC_API_KEY` environment variable and use `model="anthropic/claude-sonnet-4-20250514"`, or you can set up [Ollama](https://ollama.com/) and use `model="ollama_chat/gpt-oss:20b"`. Note that not all models support structured output, so if you want to use the `return_type` feature, you should choose a model that supports structured output. Semlib currently uses GPT-4o as the default model (and thus requires `OPENAI_API_KEY` to be set). Semlib is optimized to be used in asynchronous code. Semlib also exports a synchronous interface (functions with the `_sync` suffix, which just wrap the asynchronous interface using [asyncio.run](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run)). If for some reason you want to use the synchronous interface within asynchronous code (not recommended), you can consider [nest_asyncio](https://github.com/erdewit/nest_asyncio). 
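To make these conventions concrete, here is a minimal sketch of both `return_type` styles. The `Movie` model and the prompts are illustrative assumptions rather than anything from Semlib's docs; the sketch assumes that `prompt` accepts a Pydantic model as `return_type` in the same way `map` does in the examples below, and that `OPENAI_API_KEY` is set since no `model` is specified.

```
import asyncio

from pydantic import BaseModel

from semlib import Bare, Session


class Movie(BaseModel):
    # illustrative Pydantic model; any Pydantic model can serve as a return_type
    title: str
    year: int


async def main() -> None:
    session = Session()  # uses the default model; assumes OPENAI_API_KEY is set

    # return_type as a Pydantic model: the LLM's output is parsed and validated
    movie = await session.prompt(
        "Name a movie starring Ronald Reagan, including its release year.",
        return_type=Movie,
    )

    # return_type as a bare type like int, wrapped in Bare
    years = await session.map(
        ["Jimmy Carter", "Ronald Reagan"],
        template="In what year was {} born?",
        return_type=Bare(int),
    )

    print(movie, years)


asyncio.run(main())
```

In a synchronous script, the `_sync` variants described above would let you drop the explicit `asyncio.run` call.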
# Examples # Overview This section of the documentation has downloadable Jupyter notebooks that demonstrate various aspects of Semlib's functionality. - [**Disneyland Reviews Synthesis**](disneyland-reviews/). This notebook analyzes tens of thousands of Disneyland reviews to produce a list of top complaints, ordered by frequency, with citations. The notebook demonstrates a sophisticated multi-stage processing pipeline that includes map, reduce, apply, and non-semantic operations. - [**arXiv Paper Recommendations**](arxiv-recommendations/). This notebook analyzes the latest papers published on [arXiv](https://arxiv.org/) and surfaces reading recommendations based on your interests. The notebook demonstrates a pipeline that includes semantic filter and sort operations. - [**Airline Support Report**](airline-support/). This notebook analyzes a large number of airline support tickets and produces a report summarizing complaints. The notebook is directly translated from a DocETL example, allowing for a direct comparison between the systems. - [**Resume Filtering**](resume-filtering/). This notebook analyzes PDF resumes, extracts structured data, and filters the resumes according to a criteria. The notebook demonstrates processing PDFs using an open-source PDF-to-Markdown conversion tool, using Semlib with local models using [Ollama](https://ollama.com/), and using a series of maps to implement a model cascade in Semlib. # Airline Support Report[¶](#airline-support-report) This notebook analyzes a large number of airline support tickets and produces a report summarizing the complaints. This notebook mirrors the [airline support tutorial](https://data-people-group.github.io/blogs/2025/01/13/docwrangler/#what-is-docwrangler) in DocWrangler / DocETL ([Shankar et al., 2024](https://arxiv.org/abs/2410.12189)), allowing for direct comparison between DocETL and Semlib. The processing pipeline consists of two phases: - ([map](../../api/#semlib.Session.map)) In one LLM call *per complaint*, extract the chief complaint and other information as structured data. - ([apply](../../api/#semlib.Session.apply)) In a single LLM call, process all the structured data and produce a report summarizing the complaints. In DocETL's terminology, this is a "reduce" operation. Semlib's [reduce](../../api/#semlib.Session.reduce) mirrors the `reduce` higher-order function in functional programming, which combines elements from a list two at a time. Semlib's [apply](../../api/#semlib.Session.apply) method processes data in a single LLM call. This notebook uses the OpenAI API and costs about $0.05 to run. ## Install and configure Semlib[¶](#install-and-configure-semlib) If you don't already have Semlib installed, run: In \[ \]: Copied! ``` %pip install semlib ``` %pip install semlib We start by initializing a Semlib [Session](../../api/#semlib.Session). A session provides a context for performing Semlib operations. We configure the session to cache LLM responses on disk in `cache.db`, and we configure the default model to OpenAI's `gpt-4o-mini`. If your `OPENAI_API_KEY` is not already set in your environment, you can uncomment the line at the bottom of the next cell and set your API key there. If you wish to use a different LLM (e.g., `anthropic/claude-3-5-sonnet-20240620`), change the `model=` kwarg below. Make sure that the appropriate environment variable (e.g., `ANTHROPIC_API_KEY`) is set in your environment. You can use any LLM [supported by LiteLLM](https://docs.litellm.ai/docs/providers). In \[1\]: Copied! 
``` from semlib import OnDiskCache, Session session = Session(cache=OnDiskCache("cache.db"), model="openai/gpt-4o-mini") # Uncomment the following lines and set your OpenAI API key if not already set in your environment # import os # os.environ["OPENAI_API_KEY"] = "..." ``` from semlib import OnDiskCache, Session session = Session(cache=OnDiskCache("cache.db"), model="openai/gpt-4o-mini") # Uncomment the following lines and set your OpenAI API key if not already set in your environment # import os # os.environ["OPENAI_API_KEY"] = "..." ## Download and preview dataset[¶](#download-and-preview-dataset) In \[2\]: Copied! ``` import json import urllib.request tickets = json.loads( urllib.request.urlopen( "https://gist.githubusercontent.com/anishathalye/9d13b58d7ea820b11bcbe8c7b5704649/raw/7f0ae3f64f1107553a2f1f425473899104b3d4f8/airline_support_chats_kaggle.json" ).read() ) ``` import json import urllib.request tickets = json.loads( urllib.request.urlopen( "https://gist.githubusercontent.com/anishathalye/9d13b58d7ea820b11bcbe8c7b5704649/raw/7f0ae3f64f1107553a2f1f425473899104b3d4f8/airline_support_chats_kaggle.json" ).read() ) In \[3\]: Copied! ``` print(f"Total number of tickets: {len(tickets)}") print() print(f"Example ticket text: {tickets[0]['text'][:400]}...") ``` print(f"Total number of tickets: {len(tickets)}") print() print(f"Example ticket text: {tickets[0]['text'][:400]}...") ``` Total number of tickets: 250 Example ticket text: Here is the description of my terrible experience with FlightNetwork: I needed to book round trip tickets to Copenhagen for 3 people. We searched Google flights for the best deal and found a round trip flight for $843 from Flight Network. Since we needed to book using 3 different credit cards, I called FlightNetwork for the booking to ensure that we all get the same price and availability. First, ... ``` ## Produce report[¶](#produce-report) ### Extract chief complaint and tag[¶](#extract-chief-complaint-and-tag) We begin by defining a Pydantic model describing the structure of the data we want to extract from each ticket. In \[4\]: Copied! ``` from pydantic import BaseModel class Complaint(BaseModel): chief_complaint: str complaint_category: str frustration_level: str ``` from pydantic import BaseModel class Complaint(BaseModel): chief_complaint: str complaint_category: str frustration_level: str Next, we use the [map](../../api/#semlib.Session.map) method to prompt an LLM per-ticket to extract the data we're looking for. Semlib supports structured data extraction via the `return_type=` kwarg, which supports both Pydantic models as well as [Bare](../../api/#semlib.Bare) types. We take this prompt template from the DocETL tutorial and translate it from the Jinja2 syntax to the Python lambda / f-string supported by Semlib. The following cell takes about 45 seconds, with the default concurrency level, to process 250 tickets. If you have high OpenAI rate limits, you can experiment with higher concurrency by passing `max_concurrency=` to the `Session` constructor above. In \[5\]: Copied! ``` complaints: list[Complaint] = await session.map( tickets, lambda ticket: f""" Describe the chief complaint from the user, including direct quotes from the user to capture their exact words. {ticket["text"]} Additionally, categorize the complaint into one of the following categories: pricing, tech issue, user error, customer service, or animal/pet assistance. Ensure that the category is in lowercase.
Determine the level of frustration expressed by the user, using the values: high, medium, or low (all in lowercase). Format the output as follows: - Chief Complaint: [User's quoted complaint] - Complaint Category: [Selected category] - Frustration Level: [high/medium/low] Example output: - Chief Complaint: "I was charged extra fees that were not disclosed during booking." - Complaint Category: pricing - Frustration Level: high """.strip(), return_type=Complaint, ) ``` complaints: list[Complaint] = await session.map( tickets, lambda ticket: f""" Describe the chief complaint from the user, including direct quotes from the user to capture their exact words. {ticket["text"]} Additionally, categorize the complaint into one of the following categories: pricing, tech issue, user error, customer service, or animal/pet assistance. Ensure that the category is in lowercase. Determine the level of frustration expressed by the user, using the values: high, medium, or low (all in lowercase). Format the output as follows: - Chief Complaint: [User's quoted complaint] - Complaint Category: [Selected category] - Frustration Level: [high/medium/low] Example output: - Chief Complaint: "I was charged extra fees that were not disclosed during booking." - Complaint Category: pricing - Frustration Level: high """.strip(), return_type=Complaint, ) We can preview what one of the complaints looks like: In \[6\]: Copied! ``` complaints[0] ``` complaints[0] Out\[6\]: ``` Complaint(chief_complaint='"Overall absolutely terrible terrible nightmarish experience with FlightNetwork. Not worth the headache and lack of transparency."', complaint_category='customer service', frustration_level='high') ``` ### Combine individual complaints into a single report[¶](#combine-individual-complaints-into-a-single-report) Next, we use an LLM to analyze all the complaints together and generate a report. Here, we use the [apply](../../api/#semlib.Session.apply) method to apply an LLM prompt to a value (which is itself a thin wrapper around a simple LLM [prompt](../../api/#semlib.Session.prompt)). We take this prompt template from the DocETL tutorial as well; we translate it from the Jinja2 syntax as follows. First, we define a function for formatting a single complaint: In \[7\]: Copied! ``` def format_complaint(x: tuple[int, Complaint]) -> str: return f""" Ticket #{x[0]}: - Complaint Category: {x[1].complaint_category} - Frustration Level: {x[1].frustration_level} - Chief Complaint: {x[1].chief_complaint} """.strip() ``` def format_complaint(x: tuple[int, Complaint]) -> str: return f""" Ticket #{x[0]}: - Complaint Category: {x[1].complaint_category} - Frustration Level: {x[1].frustration_level} - Chief Complaint: {x[1].chief_complaint} """.strip() Next, we invoke the `apply` method, passing it: - `enumerate(complaints)`, which adds ticket numbers - A callable prompt `template`, which takes as an argument the list of (ticket number, complaint) tuples and returns a formatted string, which will be passed to the LLM Here, we do not specify a `return_type`, so we get back a string containing the report. In \[8\]: Copied! ``` report = await session.apply( enumerate(complaints), lambda c: f""" Here are some complaints found in the dataset: {"\n\n".join(map(format_complaint, c))} Summarize the common complaints across all tickets, and highlight how they differ across frustration levels. 
""".strip(), ) ``` report = await session.apply( enumerate(complaints), lambda c: f""" Here are some complaints found in the dataset: {"\\n\\n".join(map(format_complaint, c))} Summarize the common complaints across all tickets, and highlight how they differ across frustration levels. """.strip(), ) ### Cost analysis[¶](#cost-analysis) We used GPT-4o-mini, a low-cost LLM. We can check the total cost for the 251 LLM operations: In \[9\]: Copied! ``` f"${session.total_cost():.3f}" ``` f"${session.total_cost():.3f}" Out\[9\]: ``` '$0.037' ``` ## Final result[¶](#final-result) The final report is written in Markdown. We can render this Markdown in the notebook with the following: In \[10\]: Copied! ``` from IPython.display import display_markdown display_markdown(report, raw=True) ``` from IPython.display import display_markdown display_markdown(report, raw=True) ### Common Complaints summarized across all tickets:[¶](#common-complaints-summarized-across-all-tickets) 1. **Customer Service Issues**: - Many complaints center on poor customer service experiences, including unhelpful staff, inaccessible customer support (long wait times), and lack of clear communication regarding flight cancellations or changes. - High frustration level tickets often express feelings of being disregarded, misled, or treated poorly, while medium frustration tickets may highlight confusion or disappointment rather than outright anger. 1. **Flight Cancellations and Delays**: - A significant number of complaints relate to flight cancellations or delays and the inadequate responsiveness of airlines in addressing the situation. High frustration tickets emphasize the resulting inconveniences, such as missed connections and financial losses. - Medium frustration tickets highlight general irritation with the situation without an extreme emotional reaction. 1. **Pricing and Hidden Charges**: - Numerous complaints address unexpected costs or fees that passengers encountered, such as higher charges for luggage and unanticipated pricing discrepancies. - High frustration tickets denote a sense of being exploited or frustrated by the pricing policies, while medium frustration tickets express confusion or annoyance at the difficulty of navigating costs. 1. **Baggage Issues**: - Complaints about lost or damaged luggage are prevalent, with many expressing disbelief over the compensation or support available for such incidents. High frustration tickets articulate distress over valuable items lost or the impact on travel plans. - Medium frustration tickets may relay concern without reaching the level of significant anxiety. 1. **Tech Issues and Booking Complications**: - Several complaints relate to technical problems during the booking process or confusion regarding ticket rules and regulations, particularly around connecting flights and baggage handling. - High frustration tickets are marked by feelings of helplessness regarding procedural issues, while medium frustration tickets reveal confusion about technology but not an overwhelming sense of urgency. 1. **Seating Assignments**: - Problems with seating arrangements and failures to honor pre-allocated seats are reported frequently, often leading to significant frustrations when families are separated or downgraded. - High frustration complaints typically involve outright anger over the perceived injustices, while medium-level frustration might indicate disappointment or uncertainty. 
### Frustration Levels:[¶](#frustration-levels) - **High Frustration Level**: - Complaints are intense, often expressing a mixture of anger, betrayal, or a feeling of being wronged. The customers frequently demand action, compensation, or vocalize a firm decision to avoid using that airline again. There’s often a strong emotional reaction tied to personal losses or stressful situations. - **Medium Frustration Level**: - These complaints convey dissatisfaction but often include a more measured tone. Customers are frustrated but might be seeking advice, expressing confusion about policies, or sharing experiences without an extreme emotional response. They may still be disappointed or annoyed but less likely than high frustration customers to express outrage or threat of action. - **Low Frustration Level**: - Complaints at this level generally reflect positive remarks or minor issues. Customers express satisfaction with some aspects of service, appreciate efforts made by staff, or confirm that their experiences were generally acceptable, aside from small complaints or suggestions for improvement. In summary, the overall complaints indicate major issues with customer service, flight management, and pricing policies, with the intensity of complaints varying significantly based on the frustration levels of the customers. High frustration tickets tend to express feelings of helplessness and betrayal, while medium frustration tickets reflect confusion and irritation without extreme emotional responses. Low frustration complaints were rare and often offered constructive feedback instead of outright complaints. # arXiv Paper Recommendations[¶](#arxiv-paper-recommendations) This notebook analyzes the latest papers published on [arXiv](https://arxiv.org/) and surfaces reading recommendations based on your interests. The analysis is a two-stage pipeline: - ([filter](../../api/#semlib.Session.filter)) Filter out irrelevant papers: ones on topics you're not interested in. - ([sort](../../api/#semlib.Session.sort)) Sort the remaining papers according to your research interests and other factors like reputation of authors. This notebook uses the OpenAI API and costs about $2.50 to run. You can reduce costs by sub-sampling the dataset or using cheaper models. ## Install and configure dependencies[¶](#install-and-configure-dependencies) In addition to Semlib, this notebook uses [arxiv.py](https://github.com/lukasschwab/arxiv.py). In \[ \]: Copied! ``` %pip install semlib arxiv ``` %pip install semlib arxiv We start by initializing a Semlib [Session](../../api/#semlib.Session). A session provides a context for performing Semlib operations. We configure the session to cache LLM responses on disk in `cache.db`. This notebook uses OpenAI models. If your `OPENAI_API_KEY` is not already set in your environment, you can uncomment the line at the bottom of the next cell and set your API key there. In \[1\]: Copied! ``` import semlib from semlib import OnDiskCache, Session session = Session(cache=OnDiskCache("cache.db")) # Uncomment the following lines and set your OpenAI API key if not already set in your environment # import os # os.environ["OPENAI_API_KEY"] = "..." ``` import semlib from semlib import OnDiskCache, Session session = Session(cache=OnDiskCache("cache.db")) # Uncomment the following lines and set your OpenAI API key if not already set in your environment # import os # os.environ["OPENAI_API_KEY"] = "..." 
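Before fetching real data, here is a toy-scale sketch of the two-stage shape this notebook builds (a semantic filter followed by a semantic sort), using the simple `by=` form of `filter` and `sort` from the Quickstart and the `session` configured in the cell above. The hard-coded titles and criteria strings are illustrative placeholders, not part of this notebook's dataset.

```
# Toy-scale preview of the pipeline below: filter out off-topic items, then rank what remains.
# The titles and criteria here are made-up placeholders.
titles = [
    "A Formal Verification Framework for Distributed Consensus",
    "Protein Folding Prediction with Graph Neural Networks",
    "Zero-Knowledge Proofs for Verifiable Machine Learning",
]

# Stage 1: drop items matching topics we are *not* interested in (negate=True keeps non-matches).
relevant_titles = await session.filter(titles, by="about biology or chemistry", negate=True)

# Stage 2: order the remaining items by the criterion; the most relevant item ends up last.
ranked_titles = await session.sort(relevant_titles, by="relevance to security and formal methods")
print(ranked_titles)
```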
## Download and preview data[¶](#download-and-preview-data) We start by defining a function to fetch arXiv paper metadata given a set of categories along with a date range. In \[2\]: Copied! ``` from datetime import date import arxiv def get_papers(categories: list[str], start_date: date, end_date: date) -> list[arxiv.Result]: query_cat = " OR ".join(f"cat:{cat}" for cat in categories) query_date = f"submittedDate:[{start_date.strftime('%Y%m%d')} TO {end_date.strftime('%Y%m%d')}]" query = f"({query_cat}) AND {query_date}" search = arxiv.Search(query) client = arxiv.Client() return list(client.results(search)) ``` from datetime import date import arxiv def get_papers(categories: list[str], start_date: date, end_date: date) -> list\[arxiv.Result\]: query_cat = " OR ".join(f"cat:{cat}" for cat in categories) query_date = f"submittedDate:[{start_date.strftime('%Y%m%d')} TO {end_date.strftime('%Y%m%d')}]" query = f"({query_cat}) AND {query_date}" search = arxiv.Search(query) client = arxiv.Client() return list(client.results(search)) Next, we fetch a batch of papers. Feel free to edit the list of [categories](https://arxiv.org/category_taxonomy) to match your interests, or update the date range to get the most recent papers at the time you're running this notebook. In \[17\]: Copied! ``` papers = get_papers(["cs.AI", "cs.LG"], date(2025, 8, 29), date(2025, 9, 4)) print(f"Number of papers: {len(papers)}\n") print(f"Example title: {papers[0].title}\n") print(f"Example abstract: {papers[0].summary[:400].replace('\n', ' ')}...") ``` papers = get_papers(["cs.AI", "cs.LG"], date(2025, 8, 29), date(2025, 9, 4)) print(f"Number of papers: {len(papers)}\\n") print(f"Example title: {papers[0].title}\\n") print(f"Example abstract: {papers[0].summary[:400].replace('\\n', ' ')}...") ``` Number of papers: 907 Example title: MyGO: Memory Yielding Generative Offline-consolidation for Lifelong Learning Systems Example abstract: Continual or Lifelong Learning aims to develop models capable of acquiring new knowledge from a sequence of tasks without catastrophically forgetting what has been learned before. Existing approaches often rely on storing samples from previous tasks (experience replay) or employing complex regularization terms to protect learned weights. However, these methods face challenges related to data priva... ``` ## Find most relevant papers[¶](#find-most-relevant-papers) ### Filter out irrelevant papers[¶](#filter-out-irrelevant-papers) We start off by *filtering out* papers that are irrelevant, given a list of topics you're definitely *not* interested in. We do this with Semlib's [filter](../../api/#semlib.Session.filter) method. By default, this method *keeps* items matching a criteria, but sometimes, LLMs perform better on an "inverse" binary classification problem, like "is this paper about any of the following topics?", rather than "is this paper NOT about any of the following topics?", so this method supports a `negate=True` argument where it keeps all items that do *not* match the criteria given to the LLM. For this rough filtering stage, we use a low-cost model, `gpt-4.1-nano`. Feel free to edit the list of topics in the prompt below to match your preferences. The following cell takes about 30 seconds to run with the default `max_concurrency` level (feel free to change it in the `Session` constructor above). In \[5\]: Copied! ``` relevant = await session.filter( papers, template=lambda p: f""" Your task is to determine if the following academic paper is on any of the following topics. 
Paper title: {p.title} Paper abstract: {p.summary} Is the paper about any of the following topics? - Medicine - Healthcare - Biology - Chemistry - Physics """.strip(), model="openai/gpt-4.1-nano", negate=True, ) ``` relevant = await session.filter( papers, template=lambda p: f""" Your task is to determine if the following academic paper is on any of the following topics. Paper title: {p.title} Paper abstract: {p.summary} Is the paper about any of the following topics? - Medicine - Healthcare - Biology - Chemistry - Physics """.strip(), model="openai/gpt-4.1-nano", negate=True, ) We can see how many papers we managed to filter out, and also take a look at some of the irrelevant papers, to make sure the filter worked well. In \[16\]: Copied! ``` print(f"Filtered out {len(papers) - len(relevant)} irrelevant papers, including:") for paper in list({i.title for i in papers} - {i.title for i in relevant})[:5]: print(f"- {paper}") ``` print(f"Filtered out {len(papers) - len(relevant)} irrelevant papers, including:") for paper in list({i.title for i in papers} - {i.title for i in relevant})\[:5\]: print(f"- {paper}") ``` Filtered out 228 irrelevant papers, including: - Abex-rat: Synergizing Abstractive Augmentation and Adversarial Training for Classification of Occupational Accident Reports - Deep Self-knowledge Distillation: A hierarchical supervised learning for coronary artery segmentation - Quantum-Enhanced Natural Language Generation: A Multi-Model Framework with Hybrid Quantum-Classical Architectures - Multimodal learning of melt pool dynamics in laser powder bed fusion - Temporally-Aware Diffusion Model for Brain Progression Modelling with Bidirectional Temporal Regularisation ``` ### Sort papers by relevance[¶](#sort-papers-by-relevance) Next, we sort the list of papers by relevance using Semlib's [sort](../../api/#semlib.Session.sort) method, which sorts items by using an LLM to perform pairwise comparisons. The API supports framing the comparison [task](../../api/compare/#semlib.compare.Task) in a number of ways. Here, we ask the LLM to choose the better fit between options "A" and "B". We start by defining the prompt for the LLM. We put the static instructions at the start of the LLM prompt to take advantage of [prompt caching](https://openai.com/index/api-prompt-caching/). Feel free to edit the list of interests to match yours. In \[7\]: Copied! ``` COMPARISON_TEMPLATE = """ You are a research assistant. Help me pick a research paper to read, based on what is most relevant to my interests and what is most likely to be high-quality work based on the title, authors, and abstract. You will be given context on my interests, and two paper abstracts. My research interests include: - Machine learning and artificial intelligence - Systems - Security - Formal methods Here is paper Option A: {} Here is paper Option B: {} Choose the option (either A or B) that is more relevant to my interests and likely to be a high-quality work. """.strip() ``` COMPARISON_TEMPLATE = """ You are a research assistant. Help me pick a research paper to read, based on what is most relevant to my interests and what is most likely to be high-quality work based on the title, authors, and abstract. You will be given context on my interests, and two paper abstracts.
My research interests include: - Machine learning and artificial intelligence - Systems - Security - Formal methods Here is paper Option A: {} Here is paper Option B: {} Choose the option (either A or B) that is more relevant to my interests and likely to be a high-quality work. """.strip() The [sort](../../api/#semlib.Session.sort) API supports a variety of alternatives for supplying a prompt template, such as providing a callable that takes a pair of items and returns a string. In this notebook, we supply a `to_str` function that converts items to a string representation, and a prompt template that is a format string with two placeholders. Next, we define the `to_str` function, which converts a paper (metadata object) to a string. In \[8\]: Copied! ``` def to_str(paper: arxiv.Result) -> str: return f""" Title: {paper.title} Authors: {", ".join(author.name for author in paper.authors)} Abstract: {paper.summary} """.strip() ``` def to_str(paper: arxiv.Result) -> str: return f""" Title: {paper.title} Authors: {", ".join(author.name for author in paper.authors)} Abstract: {paper.summary} """.strip() Finally, we're ready to call `sort()`. Earlier, we used the `gpt-4.1-nano` model to filter papers because that's an easy task and this model is cheaper. For the following sort operation, we use the `gpt-4.1-mini` model. Semlib lets you choose the model on a per-operation basis to control the cost-quality-latency tradeoff. Here, we use the Quicksort algorithm for an average O(n log n) LLM calls. By default, sort performs O(n^2) LLM calls to achieve a higher-quality result. The following cell takes about 7 minutes to run with the default `max_concurrency` setting. In \[9\]: Copied! ``` sorted_results = await session.sort( relevant, to_str=to_str, template=COMPARISON_TEMPLATE, algorithm=semlib.sort.QuickSort(randomized=False), model="openai/gpt-4.1-mini", ) ``` sorted_results = await session.sort( relevant, to_str=to_str, template=COMPARISON_TEMPLATE, algorithm=semlib.sort.QuickSort(randomized=False), model="openai/gpt-4.1-mini", ) ### Cost analysis[¶](#cost-analysis) In \[10\]: Copied! ``` f"${session.total_cost():.2f}" ``` f"${session.total_cost():.2f}" Out\[10\]: ``` '$2.53' ``` ## Results[¶](#results) `sorted_results` now contains the papers ordered from least aligned to most aligned with your research interests. Let's take a look at some of the top results. ### Most aligned[¶](#most-aligned) In \[11\]: Copied! ``` def format_paper(paper: arxiv.Result) -> str: return f"""{paper.title} ({", ".join(author.name for author in paper.authors)}) {paper.entry_id} {paper.summary[:200].replace("\n", " ")}...""" for i, p in enumerate(reversed(sorted_results[-5:])): print(f"{i + 1}. {format_paper(p)}\n\n") ``` def format_paper(paper: arxiv.Result) -> str: return f"""{paper.title} ({", ".join(author.name for author in paper.authors)}) {paper.entry_id} {paper.summary[:200].replace("\\n", " ")}...""" for i, p in enumerate(reversed(sorted_results[-5:])): print(f"{i + 1}. {format_paper(p)}\\n\\n") ``` 1. Enabling Trustworthy Federated Learning via Remote Attestation for Mitigating Byzantine Threats (Chaoyu Zhang, Heng Jin, Shanghao Shi, Hexuan Yu, Sydney Johns, Y. Thomas Hou, Wenjing Lou) http://arxiv.org/abs/2509.00634v1 Federated Learning (FL) has gained significant attention for its privacy-preserving capabilities, enabling distributed devices to collaboratively train a global model without sharing raw data. However... 2. 
zkLoRA: Fine-Tuning Large Language Models with Verifiable Security via Zero-Knowledge Proofs (Guofu Liao, Taotao Wang, Shengli Zhang, Jiqun Zhang, Shi Long, Dacheng Tao) http://arxiv.org/abs/2508.21393v1 Fine-tuning large language models (LLMs) is crucial for adapting them to specific tasks, yet it remains computationally demanding and raises concerns about correctness and privacy, particularly in unt... 3. An Information-Flow Perspective on Explainability Requirements: Specification and Verification (Bernd Finkbeiner, Hadar Frenkel, Julian Siber) http://arxiv.org/abs/2509.01479v1 Explainable systems expose information about why certain observed effects are happening to the agents interacting with them. We argue that this constitutes a positive flow of information that needs to... 4. Poisoned at Scale: A Scalable Audit Uncovers Hidden Scam Endpoints in Production LLMs (Zhiyang Chen, Tara Saba, Xun Deng, Xujie Si, Fan Long) http://arxiv.org/abs/2509.02372v1 Large Language Models (LLMs) have become critical to modern software development, but their reliance on internet datasets for training introduces a significant security risk: the absorption and reprod... 5. ANNIE: Be Careful of Your Robots (Yiyang Huang, Zixuan Wang, Zishen Wan, Yapeng Tian, Haobo Xu, Yinhe Han, Yiming Gan) http://arxiv.org/abs/2509.03383v1 The integration of vision-language-action (VLA) models into embodied AI (EAI) robots is rapidly advancing their ability to perform complex, long-horizon tasks in humancentric environments. However, EA... ``` ### Least aligned[¶](#least-aligned) In \[12\]: Copied! ``` for i, p in enumerate(sorted_results[:5]): print(f"{i + 1}. {format_paper(p)}\n\n") ``` for i, p in enumerate(sorted_results[:5]): print(f"{i + 1}. {format_paper(p)}\\n\\n") ``` 1. Content and Engagement Trends in COVID-19 YouTube Videos: Evidence from the Late Pandemic (Nirmalya Thakur, Madeline D Hartel, Lane Michael Boden, Dallas Enriquez, Boston Joyner Ricks) http://arxiv.org/abs/2509.01954v1 This work investigated about 10,000 COVID-19-related YouTube videos published between January 2023 and October 2024 to evaluate how temporal, lexical, linguistic, and structural factors influenced eng... 2. Generative KI für TA (Wolfgang Eppler, Reinhard Heil) http://arxiv.org/abs/2509.02053v1 Many scientists use generative AI in their scientific work. People working in technology assessment (TA) are no exception. TA's approach to generative AI is twofold: on the one hand, generative AI is ... 3. Why it is worth making an effort with GenAI (Yvonne Rogers) http://arxiv.org/abs/2509.00852v1 Students routinely use ChatGPT and the like now to help them with their homework, such as writing an essay. It takes less effort to complete and is easier to do than by hand. It can even produce as go... 4. Community-Centered Spatial Intelligence for Climate Adaptation at Nova Scotia's Eastern Shore (Gabriel Spadon, Oladapo Oyebode, Camilo M. Botero, Tushar Sharma, Floris Goerlandt, Ronald Pelot) http://arxiv.org/abs/2509.01845v1 This paper presents an overview of a human-centered initiative aimed at strengthening climate resilience along Nova Scotia's Eastern Shore. This region, a collection of rural villages with deep ties t... 5. Quantifying the Social Costs of Power Outages and Restoration Disparities Across Four U.S. 
Hurricanes (Xiangpeng Li, Junwei Ma, Bo Li, Ali Mostafavi) http://arxiv.org/abs/2509.02653v1 The multifaceted nature of disaster impact shows that densely populated areas contribute more to aggregate burden, while sparsely populated but heavily affected regions suffer disproportionately at th... ``` # Disneyland Reviews Synthesis[¶](#disneyland-reviews-synthesis) This notebook analyzes tens of thousands of [reviews of Disneyland](https://www.kaggle.com/datasets/arushchillar/disneyland-reviews) to produce a list of the top complaints, ordered by frequency, with citations. The analysis is implemented with the following pipeline: - ([map](../../api/#semlib.Session.map)) Extract criticism from each review, using one LLM call per review. - (non-semantic filter) Filter out reviews that do not contain any complaints. - ([reduce](../../api/#semlib.Session.reduce)) Combine criticism into a single summary, using O(n) LLM calls, using an associative combinator to improve latency (to O(log n) with unlimited concurrency). - ([apply](../../api/#semlib.Session.apply)) Parse the criticism summary into separate items. - ([map](../../api/#semlib.Session.map)) Compute citations: for each review, determine which of the criticisms are supported by the review. - (non-semantic sort) Order the criticism by frequency. This notebook uses the OpenAI API and costs about $5 to run. If you want to reduce costs and/or running time, you can sub-sample the data (e.g., `reviews = reviews[:100]`) or switch to a faster and cheaper model (e.g., `gpt-4.1-nano`). ## Install and configure Semlib[¶](#install-and-configure-semlib) If you don't already have Semlib installed, run: In \[ \]: Copied! ``` %pip install semlib ``` %pip install semlib We start by initializing a Semlib [Session](../../api/#semlib.Session). A session provides a context for performing Semlib operations. We configure the session to cache LLM responses on disk in `cache.db`, set the default model to OpenAI's `gpt-4o-mini`, and set max concurrency to 100. We use a high `max_concurrency=` kwarg because we'll be making tens of thousands of LLM queries, and this will speed up the computation. If you run into issues due to your OpenAI rate limits, you can decrease this number. You can also edit the notebook to sub-sample data if you want to reduce costs and running time. If your `OPENAI_API_KEY` is not already set in your environment, you can uncomment the line at the bottom of the next cell and set your API key there. If you wish to use a different LLM (e.g., `anthropic/claude-3-5-sonnet-20240620`), change the `model=` kwarg below. Make sure that the appropriate environment variable (e.g., `ANTHROPIC_API_KEY`) is set in your environment. You can use any LLM [supported by LiteLLM](https://docs.litellm.ai/docs/providers). In \[1\]: Copied! ``` from semlib import Bare, Box, OnDiskCache, Session session = Session(cache=OnDiskCache("cache.db"), model="openai/gpt-4o-mini", max_concurrency=100) # Uncomment the following lines and set your OpenAI API key if not already set in your environment # import os # os.environ["OPENAI_API_KEY"] = "..." ``` from semlib import Bare, Box, OnDiskCache, Session session = Session(cache=OnDiskCache("cache.db"), model="openai/gpt-4o-mini", max_concurrency=100) # Uncomment the following lines and set your OpenAI API key if not already set in your environment # import os # os.environ["OPENAI_API_KEY"] = "..." ## Download and preview dataset[¶](#download-and-preview-dataset) In \[ \]: Copied! 
``` !curl -s -L -o disneyland-reviews.zip https://www.kaggle.com/api/v1/datasets/download/arushchillar/disneyland-reviews !unzip -q -o disneyland-reviews.zip ``` !curl -s -L -o disneyland-reviews.zip https://www.kaggle.com/api/v1/datasets/download/arushchillar/disneyland-reviews !unzip -q -o disneyland-reviews.zip In this notebook, we only consider the reviews for Disneyland California. In \[2\]: Copied! ``` import csv with open("DisneylandReviews.csv", encoding="latin-1") as f_in: csv_file = csv.reader(f_in) header = next(csv_file) reviews = [dict(zip(header, row, strict=False)) for row in csv_file] reviews = [r for r in reviews if r["Branch"] == "Disneyland_California"] print(f"Loaded {len(reviews)} reviews\n") print(f"Example review: {reviews[0]['Review_Text']}") ``` import csv with open("DisneylandReviews.csv", encoding="latin-1") as f_in: csv_file = csv.reader(f_in) header = next(csv_file) reviews = [dict(zip(header, row, strict=False)) for row in csv_file] reviews = \[r for r in reviews if r["Branch"] == "Disneyland_California"\] print(f"Loaded {len(reviews)} reviews\\n") print(f"Example review: {reviews[0]['Review_Text']}") ``` Loaded 19406 reviews Example review: This place has always been and forever will be special. The feeling you get entering the park, seeing the characters and different attractions is just priceless. This is definitely a dream trip for all ages, especially young kids. Spend the money and go to Disneyland, you will NOT regret it ``` ## Synthesize criticism[¶](#synthesize-criticism) ### Extract criticism from each review[¶](#extract-criticism-from-each-review) We use the [map](../../api/#semlib.Session.map) method to extract criticism, if any, from each review. The following cell takes about 3 minutes to run, with the `max_concurrency=100` set at the top of this notebook. In \[3\]: Copied! ``` extracted_criticism = await session.map( reviews, template=lambda r: f""" Extract any criticism from this review of Disneyland California, as a succinct bulleted list. If there is none, respond '(none)'. {r["Review_Text"]} """.strip(), ) ``` extracted_criticism = await session.map( reviews, template=lambda r: f""" Extract any criticism from this review of Disneyland California, as a succinct bulleted list. If there is none, respond '(none)'. {r["Review_Text"]} """.strip(), ) We can see what some of these items looks like. In \[4\]: Copied! ``` print(f"Criticism from review 0:\n{extracted_criticism[0]}\n") print(f"Criticism from review 1:\n{extracted_criticism[1]}") ``` print(f"Criticism from review 0:\\n{extracted_criticism[0]}\\n") print(f"Criticism from review 1:\\n{extracted_criticism[1]}") ``` Criticism from review 0: (none) Criticism from review 1: - Nothing is cheap; it may be expensive for some visitors. - Restrictions on items allowed at entry gates (selfie sticks, glass refill bottles, etc). ``` ### Filter out reviews without complaints[¶](#filter-out-reviews-without-complaints) For downstream analysis, we only want to process the non-empty criticism. We already asked the LLM to output "(none)" when the review contains no criticism, so we don't need a semantic operator for this step. In \[5\]: Copied! 
``` criticism = [text for text in extracted_criticism if text != "(none)"] print(f"{len(criticism)} out of {len(reviews)} reviews contain criticism") ``` criticism = [text for text in extracted_criticism if text != "(none)"] print(f"{len(criticism)} out of {len(reviews)} reviews contain criticism") ``` 13293 out of 19406 reviews contain criticism ``` ### Combine criticism into a single summary[¶](#combine-criticism-into-a-single-summary) Now, we have tens of thousands of individual complaints. We want to combine this into a single summary of criticism. One option is to dump all the complaints into a long-context LLM and ask it to produce a summary in one go; this could work, but some research has shown that LLMs can exhibit poor performance when processing such large amounts of data in a single shot (this is one motivation behind research works like DocETL ([Shankar et al., 2024](https://arxiv.org/abs/2410.12189)) and LOTUS ([Patel et al., 2024](https://arxiv.org/abs/2407.11418))). In this notebook, we use Semlib's [reduce](../../api/#semlib.Session.reduce) operator to process the data, which is analogous to the [reduce or fold higher-order function](https://en.wikipedia.org/wiki/Fold_%28higher-order_function%29) in functional programming. The reduce operator takes as an argument a template that explains how to combine an "accumulator" with a single item, and applies this template n times to process each item in the dataset. As a performance optimization, if we implement an associative template, then we can pass `associative=True` to the call to Semlib's `reduce()`, and then rather than issue O(n) LLM calls serially, Semlib will arrange the computation in a tree structure, with depth O(log n), which has a huge impact on decreasing the latency of the operation. For computations that are "naturally" associative (e.g., addition), this is straightforward; in our case, the individual items in our data to process (the leaf nodes in the reduction tree) are individual complaints, while internal nodes are summaries of complaints. To support this, we "tag" leaf nodes in the data with [Box](../../api/#semlib.Box), and write a template that behaves differently based on whether inputs are leaf nodes or internal nodes of the reduction tree. In the case of this particular task, merging individual complaints into a running summary, a single LLM prompt (regardless of whether it's handling two individual complaints, a complaint and a summary, or two summaries) would probably work fine, but for other more complex tasks, this complexity may be necessary, so this tutorial demonstrates how to do it. In \[6\]: Copied! ``` # common formatting instructions in all cases FORMATTING_INSTRUCTIONS = """Ensure that each bullet point is as succinct as possible, representing a single logical idea. Write separate criticisms as separate bullet points. Combine any similar criticism into the same bullet point. Output your answer as a single-level bulleted list with no other formatting.""" def merge_template(a: str | Box[str], b: str | Box[str]) -> str: if isinstance(a, Box) and isinstance(b, Box): # both are leaf nodes in the reduction tree (raw criticism) return f""" Consider the following two lists of criticisms about Disneyland California, and return a bulleted list summarizing the criticism from the two lists. 
{a.value} {b.value} {FORMATTING_INSTRUCTIONS} """.strip() if not isinstance(a, Box) and not isinstance(b, Box): # both are internal nodes in the reduction tree (summaries) return f""" Consider the following two lists summarizing criticism about Disneyland California, combine them into a single summary of criticism. {a} {b} {FORMATTING_INSTRUCTIONS} """.strip() # when the tree isn't perfectly balanced, there will be cases where one input is a leaf node and the other is an internal node # so we need to handle the case where one input is a raw criticism and the other is a summary if isinstance(a, Box) and not isinstance(b, Box): feedback = b criticism = a.value if not isinstance(a, Box) and isinstance(b, Box): feedback = a criticism = b.value return f""" Consider the following summary of criticism about Disneyland California, and the following criticism from a single individual. Merge that individual's criticism into the summary. {feedback} {criticism} {FORMATTING_INSTRUCTIONS} """.strip() ``` # common formatting instructions in all cases FORMATTING_INSTRUCTIONS = """Ensure that each bullet point is as succinct as possible, representing a single logical idea. Write separate criticisms as separate bullet points. Combine any similar criticism into the same bullet point. Output your answer as a single-level bulleted list with no other formatting.""" def merge_template(a: str | Box[str], b: str | Box[str]) -> str: if isinstance(a, Box) and isinstance(b, Box): # both are leaf nodes in the reduction tree (raw criticism) return f""" Consider the following two lists of criticisms about Disneyland California, and return a bulleted list summarizing the criticism from the two lists. {a.value} {b.value} {FORMATTING_INSTRUCTIONS} """.strip() if not isinstance(a, Box) and not isinstance(b, Box): # both are internal nodes in the reduction tree (summaries) return f""" Consider the following two lists summarizing criticism about Disneyland California, combine them into a single summary of criticism. {a} {b} {FORMATTING_INSTRUCTIONS} """.strip() # when the tree isn't perfectly balanced, there will be cases where one input is a leaf node and the other is an internal node # so we need to handle the case where one input is a raw criticism and the other is a summary if isinstance(a, Box) and not isinstance(b, Box): feedback = b criticism = a.value if not isinstance(a, Box) and isinstance(b, Box): feedback = a criticism = b.value return f""" Consider the following summary of criticism about Disneyland California, and the following criticism from a single individual. Merge that individual's criticism into the summary. {feedback} {criticism} {FORMATTING_INSTRUCTIONS} """.strip() After we've implemented the template, we can kick off our `reduce()` call. We wrap all of the inputs with [Box](../../api/#semlib.Box), and we pass `associative=True` to the operator. The following cell takes about 5 minutes to run, with the `max_concurrency=100` set at the top of this notebook. In \[7\]: Copied! ``` merged_criticism = await session.reduce(map(Box, criticism), template=merge_template, associative=True) ``` merged_criticism = await session.reduce(map(Box, criticism), template=merge_template, associative=True) ### Parse criticism summary into individual items[¶](#parse-criticism-summary-into-individual-items) The `merged_criticism` is a single string that contains a bulleted list. 
If the LLM performed well and outputted a well-formed Markdown list with each item on a single line, we could parse the result into separate items using some basic Python code, like `[item.strip("- ").strip() for item in merged_criticism.split("\n") if item.strip().startswith("-")]`. This could break if the feedback was hard-wrapped across multiple lines or used a different bullet point format (like `*` bullets instead of `-` bullets), for example. So instead of parsing using Python code, here we parse the response using an LLM. We want a `list[str]`, so we use Semlib's [Bare](../../api/#semlib.Bare) to annotate the return type. In \[8\]: Copied! ``` criticism_items = await session.apply( merged_criticism, "Turn this list into a JSON array of strings.\n\n{}", return_type=Bare(list[str]) ) ``` criticism_items = await session.apply( merged_criticism, "Turn this list into a JSON array of strings.\\n\\n{}", return_type=Bare(list[str]) ) For display purposes, we switch to numbered items. In \[9\]: Copied! ``` numbered_criticism = "\n".join(f"{i + 1:d}. {item}" for i, item in enumerate(criticism_items)) print(numbered_criticism) ``` numbered_criticism = "\\n".join(f"{i + 1:d}. {item}" for i, item in enumerate(criticism_items)) print(numbered_criticism) ``` 1. High admission prices and expensive food and merchandise contribute to perceptions of Disneyland as overpriced and unaffordable for average families. 2. Long wait times for rides and character interactions often exceed 30 minutes to over 2 hours, with some attractions reaching up to 6 hours, leading to visitor frustration. 3. Severe overcrowding year-round complicates navigation, creates discomfort, and can leave parks declared 'full.' 4. The FastPass system is largely ineffective due to limited availability, poor management, and confusion among guests. 5. Many attractions cater primarily to younger children, resulting in limited options for older kids and adults, leading to dissatisfaction. 6. Frequent ride breakdowns and closures occur without prior notice, causing disappointment and safety concerns. 7. Food quality is often low, with few healthy options available, and high prices contribute to visitor dissatisfaction. 8. Maintenance and cleanliness issues, such as overflowing trash cans, dirty restrooms, and unkempt areas, detract from the overall experience. 9. Limited seating areas, quiet spots, and insufficient shading result in discomfort and difficulty resting throughout the park. 10. Reports of declining staff enthusiasm, rudeness, and poor service diminish overall service quality and atmosphere. 11. Limited character interactions and excessive wait times for meet-and-greets leave guests feeling underwhelmed. 12. Safety concerns related to poor lighting, crowd management, and stroller congestion affect visitor comfort. 13. Fireworks shows are limited to weekends or canceled without clear communication, disappointing guests. 14. Guests feel the experience lacks the magic of past visits and is increasingly commercialized with a focus on merchandise sales. 15. Parking navigation issues and traffic congestion during peak hours add to travel frustrations. 16. The presence of homeless individuals near the park contributes to visitor discomfort. 17. Many visitors can only experience a limited number of rides per day, necessitating extensive planning that detracts from enjoyment. 18. Unpleasant odors and sewer issues further diminish the guest experience. 19. 
Insufficient accommodations for individuals with disabilities raise concerns for families with young children. 20. Poor communication regarding events and logistics leads to guest frustration and confusion. 21. Major construction disrupts access to attractions and amenities, impacting the overall experience. 22. Excessive walking and long park hours lead to fatigue and detract from enjoyment. 23. Limited viewing spots and seating for shows result in discomfort and detract from entertainment experiences. 24. Many merchandise offerings are repetitive and of disappointing quality, impacting the shopping experience. 25. Weather discomfort, including excessive heat during the day and cold temperatures at night, affects visitor comfort. 26. Cumbersome security checks and bag searches add frustration to the entry process. 27. The World of Color show is often considered underwhelming and fails to meet expectations. 28. Recommendations suggest visiting during off-peak times for better experiences with lighter crowds. ``` ### Compute citations[¶](#compute-citations) Now, we want to know: for each item in the summary of criticism, what reviews back up that item? To figure this out, we use a [map](../../api/#semlib.Session.map), asking an LLM to determine which criticism items are substantiated by a review, for each review. In \[10\]: Copied! ``` def citation_template(review): return f""" Which of the following pieces of criticism, if any, about Disneyland California is substantiated by the following review? {numbered_criticism} {review["Review_Text"]} Respond with a list of the numbers of the pieces of criticism that are substantiated by the review. """.strip() ``` def citation_template(review): return f""" Which of the following pieces of criticism, if any, about Disneyland California is substantiated by the following review? {numbered_criticism} {review["Review_Text"]} Respond with a list of the numbers of the pieces of criticism that are substantiated by the review. """.strip() Again, we use the [Bare](../../api/#semlib.Bare) annotation to get back structured data. The following cell takes about 2 minutes to run, with the `max_concurrency=100` set at the top of this notebook. In \[11\]: Copied! ``` per_review_citations = await session.map(reviews, citation_template, return_type=Bare(list[int])) ``` per_review_citations = await session.map(reviews, citation_template, return_type=Bare(list[int])) Now, we can see, for example, which criticisms from the summary are substantiated by a particular review. In \[12\]: Copied! ``` print(per_review_citations[1]) print(reviews[1]["Review_Text"]) ``` print(per_review_citations[1]) print(reviews[1]["Review_Text"]) ``` [1, 4] A great day of simple fun and thrills. Bring cash, nothing is cheap, but we knew that it's Disney. But they are great letting you bring in your own food, drinks, etc but read the list closely, we list several items at the entry gates (selfy sticks, glass refill bottles, etc). It is worth buying the photo pass and fastpass. Have fun! ``` ### Sort criticism by frequency[¶](#sort-criticism-by-frequency) Now that we have citations on a per-review basis, we can figure out citations on a per-criticism basis, and that'll let us sort by frequency, to find the most frequent criticisms of Disneyland California. In \[13\]: Copied! 
``` # map from criticism index (1-based) to set of review indices (0-based) that cite it citations: dict[int, set[int]] = {i + 1: set() for i in range(len(criticism_items))} for review_idx, cited in enumerate(per_review_citations): for feedback_idx in cited: # sometimes the structured output isn't perfect, and it includes numbers that are out of range, # so we filter those out here if 1 <= feedback_idx <= len(criticism_items): citations[feedback_idx].add(review_idx) ``` # map from criticism index (1-based) to set of review indices (0-based) that cite it citations: dict\[int, set[int]\] = {i + 1: set() for i in range(len(criticism_items))} for review_idx, cited in enumerate(per_review_citations): for feedback_idx in cited: # sometimes the structured output isn't perfect, and it includes numbers that are out of range, # so we filter those out here if 1 \<= feedback_idx \<= len(criticism_items): citations[feedback_idx].add(review_idx) We don't need a [semantic sort](../../api/#semlib.Session.sort) for this step, a regular old Python sort will do. In \[14\]: Copied! ``` by_count = [ (i[1], i[2]) for i in sorted(((len(citations[i + 1]), feedback, i) for i, feedback in enumerate(criticism_items)), reverse=True) ] ``` by_count = \[ (i[1], i[2]) for i in sorted(((len(citations[i + 1]), feedback, i) for i, feedback in enumerate(criticism_items)), reverse=True) \] ## Results[¶](#results) Finally, we can look at the results. Here, we look at the top 10 criticisms, showing the count of people, the summary of the criticism, a couple citations (so we could follow up by looking at individual reviews), and a single review that substantiates that criticism. In \[15\]: Copied! ``` for feedback, i in by_count[:10]: sorted_citations = sorted(citations[i + 1]) cite_str = f"{', '.join([str(c) for c in sorted_citations][:3])}, ..." print(f"({len(citations[i + 1])}) {feedback} [{cite_str}]\n") some_cite = sorted_citations[min(i * 10, len(sorted_citations) - 1)] # get some variety print(f" Review {some_cite}: {reviews[some_cite]['Review_Text']}\n\n") ``` for feedback, i in by_count\[:10\]: sorted_citations = sorted(citations[i + 1]) cite_str = f"{', '.join([str(c) for c in sorted_citations][:3])}, ..." print(f"({len(citations[i + 1])}) {feedback} [{cite_str}]\\n") some_cite = sorted_citations[min(i * 10, len(sorted_citations) - 1)] # get some variety print(f" Review {some_cite}: {reviews[some_cite]['Review_Text']}\\n\\n") ``` (7267) Severe overcrowding year-round complicates navigation, creates discomfort, and can leave parks declared 'full.' [2, 3, 9, ...] Review 76: We had a great time at Disneyland. It was nice to see Fantasmic after missing the last few times although we didn't have a great spot. We did our first trip 12 years ago and it's likely our last with the kids so it was special. We stayed off site but close enough to walk. Pros: It's Disney. Love Pirates, Indiana Jones, Space Mountain,etc. We got to see a little of Star Wars Land which looks pretty impressive. Cons: It was very crowded. The Castle was being refurbished . Not an issue for us, but if you had a 6 year old Princess lover it could be a problem. The current fireworks are not nearly as good as the old ones. Kind of odd for us.I think this is our 6th trip so we know the park very well and move around it all the time. Our kids had the app down to get quick time estimates. We found some times when ride times were low and were generally out of park mid afternoon but there early and late. 
We quickly found out it was best to hit rope drop a the park not on early hours to get in some quick rides. (5619) Long wait times for rides and character interactions often exceed 30 minutes to over 2 hours, with some attractions reaching up to 6 hours, leading to visitor frustration. [2, 7, 10, ...] Review 43: Long lines 2 1 2 hrs for one ride was a little much for active kiddos. But the ride was great once we got on it.. (5184) High admission prices and expensive food and merchandise contribute to perceptions of Disneyland as overpriced and unaffordable for average families. [1, 8, 9, ...] Review 1: A great day of simple fun and thrills. Bring cash, nothing is cheap, but we knew that it's Disney. But they are great letting you bring in your own food, drinks, etc but read the list closely, we list several items at the entry gates (selfy sticks, glass refill bottles, etc). It is worth buying the photo pass and fastpass. Have fun! (4532) The FastPass system is largely ineffective due to limited availability, poor management, and confusion among guests. [1, 2, 3, ...] Review 104: Spring break crowds were noticeable in mid march. Max pass worth the extra money for the popular rides. The Magic Mix Fireworks was a great show! We arrived at the Fantasmic Fast Pass Area about 15 minutes before the time on the fast pass and the line was already very long and we were not really able to see once we made it to the fast past viewing area. Really enjoyed the Minnie and Friends Breakfast at the Plaza Inn. (2063) Reports of declining staff enthusiasm, rudeness, and poor service diminish overall service quality and atmosphere. [6, 16, 30, ...] Review 766: Disneyland hasn't change much. It been 15 years ago. But was nice to visit again. I love just spend time with family. But park is okay need improvement. (1987) Food quality is often low, with few healthy options available, and high prices contribute to visitor dissatisfaction. [12, 18, 26, ...] Review 612: The happiest place on earth as they say. Disneyland has a magical atmosphere especially with the fancy names of rides and restaurants. Was slightly disappointing that the characters don't pop up more often like they used to years ago. But it's quite organised that they have characters in different locations at particular times for autograph signing. Food is very expensive and there isn't enough choice for vegetarians considering people visit from all over the world. Overall, disney has a way to bring out the kid in everyone and we deffo made the most of our 6th trip (1959) Many visitors can only experience a limited number of rides per day, necessitating extensive planning that detracts from enjoyment. [3, 7, 18, ...] Review 1506: We spent 2 days at Disneyland and California Adventure park on a hopper tickets and despite the crowds had a great time. In fact the crowds never really were an issue.We used the Disneyland App extensively and I would highly recommend this.We also used Max pass which allowed us to do fast pass selections on the mobile device at a higher frequency then standard this was awesome and allowed us to do heaps of rides we otherwise would have done.Also we used e tickets on entry and didn t do the print thing. The other piece of advice was with kids 2 days is required and if you want to sit down for lunch do it by mid day latest!What a fantastic time we had (1872) Limited character interactions and excessive wait times for meet-and-greets leave guests feeling underwhelmed. [51, 67, 82, ...] 
Review 1041: Los Angeles Disneyland located in Anaheim, it s not as big as the other Disneyland around the world, but it s still fun and the photo package is cheaper compared to other Disneyland, some of the characters meet and greet are not stationary, they walk around the park, so please do approach them if you see them in the park and ask for a photograph or signature, have fun! (1615) Frequent ride breakdowns and closures occur without prior notice, causing disappointment and safety concerns. [12, 48, 51, ...] Review 551: It s Disney, of course it s going to be great!!! Even the bad experience we had with rides breaking down was more than put right by Guest Relations. A magical time at the happiest place on earth! (1540) Major construction disrupts access to attractions and amenities, impacting the overall experience. [12, 22, 25, ...] Review 2184: Most of the cast members were average or slightly rude. I'm used to the happy cast members of the Magic Kingdom that want to brighten everyone's day. We went on a November 2nd (Thursday) the park was not overly busy. Be aware: The line to search your bag is long and the line to enter took much longer than expected. It would be nice to see some updates. Many parts of the park appear dated. We met Snow White she had chipped nail polish. Just not what I expected. We loved the Royal Theater performances!!! All the actors were fabulous!!! Highly recommend it!Overall there were some cool things to experience, but no need to come back. We will go to Disney World's Magic Kingdom from now on. ``` # Resume Filtering[¶](#resume-filtering) This notebook analyzes a [collection of resumes](https://www.kaggle.com/datasets/snehaanbhawal/resume-dataset), extracting the education info (university, degree, etc.) and filtering resumes according to a given criterion. This notebook shows how to use Semlib with local models using [Ollama](https://ollama.com/). It employs a model cascade to extract information using a higher-capacity model and then turns that into structured data using a smaller model (to work around [this bug](https://github.com/ollama/ollama/issues/11691) with `gpt-oss` in Ollama). The processing is implemented with the following pipeline: - (third-party tool) Convert PDF to Markdown with [Marker](https://github.com/datalab-to/marker). - ([map](../../api/#semlib.Session.map)) Use `gpt-oss:20b` to extract education information from resume Markdown content. - ([map](../../api/#semlib.Session.map)) Use `qwen3:8b` to turn the education information into structured data. - (non-semantic filter) Filter for the resumes that have master's degrees. ## Install and configure dependencies[¶](#install-and-configure-dependencies) ### Ollama[¶](#ollama) This notebook relies on [Ollama](https://ollama.com/), which you can use to run LLMs on your local machine. Download Ollama and start it before you proceed. We use two different open-source LLMs, [gpt-oss](https://ollama.com/library/gpt-oss) and [qwen3](https://ollama.com/library/qwen3). You will need a reasonably powerful machine to run these models locally. If they fail to run, or they run too slowly, you can consider trying smaller open-source models instead, or using a hosted model (e.g., via the OpenAI API) to run this notebook. First, we make sure these models are present on your local machine (if not, it's a **20 GB download**). In \[ \]: Copied!
``` !ollama pull gpt-oss:20b !ollama pull qwen3:8b ``` !ollama pull gpt-oss:20b !ollama pull qwen3:8b ### Python packages[¶](#python-packages) In addition to Semlib, this notebook uses [Marker](https://github.com/datalab-to/marker). In \[ \]: Copied! ``` %pip install semlib marker-pdf ``` %pip install semlib marker-pdf We start by initializing a Semlib [Session](../../api/#semlib.Session). A session provides a context for performing Semlib operations. We configure the session to cache LLM responses on disk in `cache.db`, and we configure the default model to the open-source `gpt-oss:20b` via the local provider `ollama_chat/`. In \[1\]: Copied! ``` from semlib import OnDiskCache, Session session = Session(cache=OnDiskCache("cache.db"), model="ollama_chat/gpt-oss:20b") ``` from semlib import OnDiskCache, Session session = Session(cache=OnDiskCache("cache.db"), model="ollama_chat/gpt-oss:20b") ## Download and preprocess dataset[¶](#download-and-preprocess-dataset) In \[ \]: Copied! ``` !curl -s -L -o resume-dataset.zip https://www.kaggle.com/api/v1/datasets/download/snehaanbhawal/resume-dataset !unzip -q -o resume-dataset.zip ``` !curl -s -L -o resume-dataset.zip https://www.kaggle.com/api/v1/datasets/download/snehaanbhawal/resume-dataset !unzip -q -o resume-dataset.zip This dataset contains resumes in PDF format (feel free to examine them in your PDF viewer: the resumes will be in the `data/` directory). We use [Marker](https://github.com/VikParuchuri/marker) to convert these to Markdown. We sub-sample the resumes to reduce processing time, considering only 10 resumes in the dataset. The first time you use Marker, it needs to download some ML models (up to about **3 GB** of data). The following cell takes about 2 minutes to run on an M3 MacBook Pro. In \[ \]: Copied! ``` import os from marker.converters.pdf import PdfConverter from marker.models import create_model_dict converter = PdfConverter( artifact_dict=create_model_dict(), ) directory = "data/data/ENGINEERING" files = sorted(os.listdir(directory))[:10] texts = [] for file in files: rendered = converter(os.path.join(directory, file)) texts.append(rendered.markdown) ``` import os from marker.converters.pdf import PdfConverter from marker.models import create_model_dict converter = PdfConverter( artifact_dict=create_model_dict(), ) directory = "data/data/ENGINEERING" files = sorted(os.listdir(directory))[:10] texts = [] for file in files: rendered = converter(os.path.join(directory, file)) texts.append(rendered.markdown) Now, we can preview what one of these resume texts looks like. We note that there are some parsing errors (the PDFs are not high-quality to begin with, and there are additional errors introduced in the conversion to Markdown). LLMs end up being pretty effective at processing data like this, though. In \[3\]: Copied! ``` print(f"{texts[0][:1000]}...") ``` print(f"{texts[0][:1000]}...") ``` ## ENGINEERINGLABTECHNICIAN Career Focus Mymain objectivein seeking employment withTriumphActuation Systems Inc. is to work in a professionalatmosphere whereIcan utilize my skillsand continueto gain experiencein theaerospaceindustry to advanceinmy career. ProfessionalExperience EngineeringLab TechnicianOct 2016 to Current CompanyNameï¼ City , State - Responsiblefor testing various seatstructures to meetspecificcertification requirements. Â - Maintain and calibratetest instruments to ensuretesting capabilitiesare maintained. - Ensure dataiscaptured and recorded correctly forcertification test reports. 
- Dutiesalso dynamictestset-up and staticsuitetesting. EngineeringLab Technician, Sr. Specialist Apr 2012 to Oct 2016 CompanyNameï¼ City , State - Utilized skills learned fromLabViewCourse 1 training to constructand maintainLabViewVI programs. - Responsiblefor fabricating and maintaining hydraulic/electricaltestequipment to complete developmentand qualification programs. - Apply engine... ``` ## Filter resumes[¶](#filter-resumes) ### Extract education information[¶](#extract-education-information) We begin with a semantic [map](../../api/#semlib.Session.map) to extract education information from the resume. We use the high-capacity `gpt-oss:20b` model (set as the default in the Session constructor above). At this time, there is a bug which prevents structured outputs from this model in Ollama, so we just use it to extract a textual description of the education information as a first step. The following cell takes about 2 minutes to run on an M3 MacBook Pro. In \[4\]: Copied! ``` all_education_texts = await session.map( texts, """ Given a resume, extract the university, graduation year, degree, and area of study for the most advanced degree the individual has. If some of this information is not present, omit it. If no university education is present, return "(none)". Resume: {} """.strip(), ) ``` all_education_texts = await session.map( texts, """ Given a resume, extract the university, graduation year, degree, and area of study for the most advanced degree the individual has. If some of this information is not present, omit it. If no university education is present, return "(none)". Resume: {} """.strip(), ) Some of the resumes don't have education information present, in which case the LLM returns "(none)". We filter these out using a non-semantic filter, and preview what one of the education infos looks like. In \[5\]: Copied! ``` education_texts = [i for i in all_education_texts if i != "(none)"] print(education_texts[0]) ``` education_texts = [i for i in all_education_texts if i != "(none)"] print(education_texts[0]) ``` Forsyth Technical Community College, 2011, Associates, Applied Science, Electronics Engineering ``` ### Extract structured data[¶](#extract-structured-data) We begin by defining a Pydantic model that describes the structured data we want to get. For the `degree` field, we use a `typing.Literal` annotation to restrict the set of values. In \[6\]: Copied! ``` from typing import Literal import pydantic class EducationInfo(pydantic.BaseModel): university: str | None graduation_year: int | None degree: Literal["Associate", "Bachelor", "Master", "Doctorate"] | None area: str | None ``` from typing import Literal import pydantic class EducationInfo(pydantic.BaseModel): university: str | None graduation_year: int | None degree: Literal["Associate", "Bachelor", "Master", "Doctorate"] | None area: str | None Now, we call `qwen3:8b`, a smaller-capacity LLM (but one that supports structured outputs in Ollama), to convert the text-based descriptions of educational information to the structured data type we defined above. The following cell takes about 30 seconds to run on an M3 MacBook Pro. In \[7\]: Copied! ``` educations = await session.map( education_texts, """ Given the following description of an individual's education, extract the university, graduation year, degree, and area of study. 
{} """.strip(), return_type=EducationInfo, model="ollama_chat/qwen3:8b", ) ``` educations = await session.map( education_texts, """ Given the following description of an individual's education, extract the university, graduation year, degree, and area of study. {} """.strip(), return_type=EducationInfo, model="ollama_chat/qwen3:8b", ) We can take a look at what one of these items looks like. In \[8\]: Copied! ``` educations[0] ``` educations[0] Out\[8\]: ``` EducationInfo(university='Forsyth Technical Community College', graduation_year=2011, degree='Associate', area='Electronics Engineering') ``` ### Filter for resumes with master's degrees[¶](#filter-for-resumes-with-masters-degrees) As a first step, we construct an `all_educations` list that contains `EducationInfo`s that correspond to the resumes in `files` and `texts` (the `educations` doesn't necessarily contain these, as we filtered out the "(none)" cases). In \[9\]: Copied! ``` all_educations: list[EducationInfo | None] = [] i = 0 for text in all_education_texts: if text != "(none)": all_educations.append(educations[i]) i += 1 else: all_educations.append(None) masters = [] for file, edu in zip(files, all_educations, strict=False): if edu is not None and edu.degree == "Master": masters.append((file, edu)) ``` all_educations: list[EducationInfo | None] = [] i = 0 for text in all_education_texts: if text != "(none)": all_educations.append(educations[i]) i += 1 else: all_educations.append(None) masters = [] for file, edu in zip(files, all_educations, strict=False): if edu is not None and edu.degree == "Master": masters.append((file, edu)) ## Results[¶](#results) In \[10\]: Copied! ``` print(f"Found {len(masters)} resumes with a Master's degree:\n") for file, edu in masters: print(f"- {os.path.join(directory, file)}: {edu.university}, {edu.graduation_year}, {edu.area}") ``` print(f"Found {len(masters)} resumes with a Master's degree:\\n") for file, edu in masters: print(f"- {os.path.join(directory, file)}: {edu.university}, {edu.graduation_year}, {edu.area}") ``` Found 6 resumes with a Master's degree: - data/data/ENGINEERING/10624813.pdf: Union College, 1989, Computer Science - data/data/ENGINEERING/10985403.pdf: Illinois Institute of Technology, 2017, Mechanical & Aerospace Engineering - data/data/ENGINEERING/11890896.pdf: San Francisco State University, 2007, Decision Sciences - data/data/ENGINEERING/11981094.pdf: Illinois Institute of Technology, None, Computer Science - data/data/ENGINEERING/12011623.pdf: University of New Hampshire, 2017, Analytics - data/data/ENGINEERING/12022566.pdf: University at Buffalo, 2014, Industrial Engineering ``` # API Reference # semlib ## Session Bases: `Sort`, `Filter`, `Extrema`, `Find`, `Apply`, `Reduce` A session provides a context for performing Semlib operations. Sessions provide defaults (e.g., default model), manage caching, implement concurrency control, and track costs. All of a Session's methods have analogous standalone functions (e.g., the Session.map method has an analogous map function). It is recommended to use a Session for any non-trivial use of the semlib library. Source code in `src/semlib/session.py` ```python class Session(Sort, Filter, Extrema, Find, Apply, Reduce): """A session provides a context for performing Semlib operations. Sessions provide defaults (e.g., default model), manage caching, implement concurrency control, and track costs. 
All of a Session's methods have analogous standalone functions (e.g., the [Session.map][semlib.map.Map.map] method has an analogous [map][semlib.map.map] function). It is recommended to use a Session for any non-trivial use of the semlib library. """ ``` ### model ```python model: str ``` Get the current model being used for completions. Returns: | Type | Description | | ----- | --------------------------- | | `str` | The model name as a string. | ### __init__ ```python __init__( *, model: str | None = None, max_concurrency: int | None = None, cache: QueryCache | None = None, ) ``` Initialize. Parameters: | Name | Type | Description | Default | | ----------------- | -------------------- | ----------- | ------- | | `model` | `str \| None` | The language model to use for completions. If not specified, uses the value from the SEMLIB_DEFAULT_MODEL environment variable, or falls back to the default model (currently "openai/gpt-4o"). This is used as the model argument for litellm unless overridden in individual method calls. | `None` | | `max_concurrency` | `int \| None` | Maximum number of concurrent API requests. If not specified, uses the value from the SEMLIB_MAX_CONCURRENCY environment variable, or defaults to 10 for most models, or 1 for Ollama models. | `None` | | `cache` | `QueryCache \| None` | If provided, this is used to cache LLM responses to avoid redundant API calls. | `None` | Raises: | Type | Description | | ------------ | ------------------------------------------------------------- | | `ValueError` | If max_concurrency is provided but is not a positive integer. | Source code in `src/semlib/_internal/base.py` ```python def __init__( self, *, model: str | None = None, max_concurrency: int | None = None, cache: QueryCache | None = None ): """Initialize. Args: model: The language model to use for completions. If not specified, uses the value from the `SEMLIB_DEFAULT_MODEL` environment variable, or falls back to the default model (currently `"openai/gpt-4o"`). This is used as the `model` argument for [litellm](https://docs.litellm.ai/docs/providers) unless overridden in individual method calls. max_concurrency: Maximum number of concurrent API requests. If not specified, uses the value from the `SEMLIB_MAX_CONCURRENCY` environment variable, or defaults to 10 for most models, or 1 for Ollama models. cache: If provided, this is used to cache LLM responses to avoid redundant API calls. Raises: ValueError: If `max_concurrency` is provided but is not a positive integer.
""" self._model = model or os.getenv("SEMLIB_DEFAULT_MODEL") or DEFAULT_MODEL if max_concurrency is None and (env_max_concurrency := os.getenv("SEMLIB_MAX_CONCURRENCY")) is not None: try: max_concurrency = int(env_max_concurrency) except ValueError: msg = "SEMLIB_MAX_CONCURRENCY must be an integer" raise ValueError(msg) from None self._max_concurrency = parse_max_concurrency(max_concurrency, self._model) self._sem = asyncio.Semaphore(self._max_concurrency) self._pending_requests: set[bytes] = set() self._cond = asyncio.Condition() # for pending requests deduplication self._cache = cache self._total_cost: float = 0.0 ``` ### apply ```python apply[T, U: BaseModel]( item: T, /, template: str | Callable[[T], str], *, return_type: type[U], model: str | None = None, ) -> U ``` ```python apply[T, U]( item: T, /, template: str | Callable[[T], str], *, return_type: Bare[U], model: str | None = None, ) -> U ``` ```python apply[T]( item: T, /, template: str | Callable[[T], str], *, return_type: None = None, model: str | None = None, ) -> str ``` Apply a language model prompt to a single item. This method formats a prompt template with the given item, sends it to the language model, and returns the response. The response can be returned as a raw string, parsed into a [Pydantic](https://pydantic.dev/) model, or extracted as a bare value using the Bare marker. This method is a simple wrapper around prompt. Parameters: | Name | Type | Description | Default | | ------------- | --------- | ---------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `item` | `T` | The item to apply the template to. | *required* | | `template` | \`str | Callable\[[T], str\]\` | A template to format with the item. This can be either a string template with a single positional placeholder, or a callable that takes the item and returns a formatted string. | | `return_type` | \`type[U] | Bare[V] | None\` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | ---- | ----------- | | \`U | V | Raises: | Type | Description | | ----------------- | ------------------------------------------------------------------------------------------------------------ | | `ValidationError` | If return_type is a Pydantic model or a Bare type and the response cannot be parsed into the specified type. | Examples: Basic usage: ```pycon >>> await session.apply( ... [1, 2, 3, 4, 5], ... template="What is the sum of these numbers: {}?", ... return_type=Bare(int), ... ) 15 ``` Source code in `src/semlib/apply.py` ```python async def apply[T, U: BaseModel, V]( self, item: T, /, template: str | Callable[[T], str], *, return_type: type[U] | Bare[V] | None = None, model: str | None = None, ) -> U | V | str: """Apply a language model prompt to a single item. This method formats a prompt template with the given item, sends it to the language model, and returns the response. The response can be returned as a raw string, parsed into a [Pydantic](https://pydantic.dev/) model, or extracted as a bare value using the [Bare][semlib.bare.Bare] marker. This method is a simple wrapper around [prompt][semlib._internal.base.Base.prompt]. Args: item: The item to apply the `template` to. template: A template to format with the item. 
This can be either a string template with a single positional placeholder, or a callable that takes the item and returns a formatted string. return_type: If not specified, the response is returned as a raw string. If a Pydantic model class is provided, the response is parsed into an instance of that model. If a [Bare][semlib.bare.Bare] instance is provided, a single value of the specified type is extracted from the response. model: If specified, overrides the default model for this call. Returns: The language model's response in the format specified by return_type. Raises: ValidationError: If `return_type` is a Pydantic model or a Bare type and the response cannot be parsed into the specified type. Examples: Basic usage: >>> await session.apply( ... [1, 2, 3, 4, 5], ... template="What is the sum of these numbers: {}?", ... return_type=Bare(int), ... ) 15 """ formatter = template.format if isinstance(template, str) else template model = model if model is not None else self._model return await self.prompt(formatter(item), return_type=return_type, model=model) ``` ### clear_cache ```python clear_cache() -> None ``` Clear the internal cache of LLM responses, if caching is enabled. Source code in `src/semlib/_internal/base.py` ```python def clear_cache(self) -> None: """Clear the internal cache of LLM responses, if caching is enabled.""" if self._cache is not None: self._cache.clear() ``` ### compare ```python compare[T]( a: T, b: T, /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> Order ``` Compare two items. This method uses a language model to compare two items and determine the relative ordering of the two items. The comparison can be customized by specifying either a criteria to compare by, or a custom prompt template. The comparison task can be framed in a number of ways (choosing the greater item, lesser item, or the ordering). Parameters: | Name | Type | Description | Default | | ---------- | ---------------------- | --------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `a` | `T` | The first item to compare. | *required* | | `b` | `T` | The second item to compare. | *required* | | `by` | \`str | None\` | A criteria specifying what aspect to compare by. If this is provided, template cannot be provided. | | `to_str` | \`Callable\[[T], str\] | None\` | If specified, used to convert items to string representation. Otherewise, uses str() on each item. If this is provided, a callable template cannot be provided. | | `template` | \`str | Callable\[[T, T], str\] | None\` | | `task` | \`Task | str | None\` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | ------- | ------------------------------ | | `Order` | The ordering of the two items. | Raises: | Type | Description | | ----------------- | ---------------------------------- | | `ValidationError` | If parsing the LLM response fails. | Examples: Basic comparison: ```pycon >>> await session.compare("twelve", "seventy two") ``` Custom criteria: ```pycon >>> await session.compare("California condor", "Bald eagle", by="wingspan") ``` Custom template and task: ```pycon >>> await session.compare( ... "proton", ... "electron", ... template="Which is smaller, (A) {} or (B) {}?", ... 
task=Task.CHOOSE_LESSER, ... ) ``` Source code in `src/semlib/compare.py` ```python async def compare[T]( self, a: T, b: T, /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> Order: """Compare two items. This method uses a language model to compare two items and determine the relative ordering of the two items. The comparison can be customized by specifying either a criteria to compare by, or a custom prompt template. The comparison task can be framed in a number of ways (choosing the greater item, lesser item, or the ordering). Args: a: The first item to compare. b: The second item to compare. by: A criteria specifying what aspect to compare by. If this is provided, `template` cannot be provided. to_str: If specified, used to convert items to string representation. Otherewise, uses `str()` on each item. If this is provided, a callable template cannot be provided. template: A custom prompt template for the comparison. Must be either a string template with two positional placeholders, or a callable that takes two items and returns a formatted string. If this is provided, `by` cannot be provided. task: The type of comparison task that is being performed in `template`. This allows for writing the template in the most convenient way possible (e.g., in some scenarios, it's easier to specify a criteria for which item is lesser, and in others, it's easier to specify a criteria for which item is greater). If this is provided, a custom `template` must also be provided. Defaults to [Task.CHOOSE_GREATER][semlib.compare.Task.CHOOSE_GREATER] if not specified. model: If specified, overrides the default model for this call. Returns: The ordering of the two items. Raises: ValidationError: If parsing the LLM response fails. Examples: Basic comparison: >>> await session.compare("twelve", "seventy two") Custom criteria: >>> await session.compare("California condor", "Bald eagle", by="wingspan") Custom template and task: >>> await session.compare( ... "proton", ... "electron", ... template="Which is smaller, (A) {} or (B) {}?", ... task=Task.CHOOSE_LESSER, ... 
) """ if task is not None and task not in {Task.CHOOSE_GREATER, Task.CHOOSE_GREATER_OR_ABSTAIN} and template is None: msg = "if 'task' is not CHOOSE_GREATER or CHOOSE_GREATER_OR_ABSTAIN, 'template' must also be provided" raise ValueError(msg) if template is not None: if callable(template) and to_str is not None: msg = "cannot provide 'to_str' when a template function is provided" raise ValueError(msg) if by is not None: msg = "cannot provide 'by' when a custom template is provided" raise ValueError(msg) to_str = to_str if to_str is not None else str if task is None: task = Task.CHOOSE_GREATER elif isinstance(task, str): task = Task(task) model = model if model is not None else self._model if isinstance(template, str): prompt = template.format(to_str(a), to_str(b)) elif template is not None: # callable prompt = template(a, b) elif by is None: prompt = _DEFAULT_TEMPLATE.format(a=to_str(a), b=to_str(b)) else: prompt = _DEFAULT_TEMPLATE_BY.format(criteria=by, a=to_str(a), b=to_str(b)) response = await self.prompt( prompt, model=model, return_type=_RETURN_TYPE_BY_TASK[task], ) match task: case Task.COMPARE: strict_compare_result = cast("_StrictCompareResult", response) return strict_compare_result.order.to_order() case Task.COMPARE_OR_ABSTAIN: compare_result = cast("_CompareResult", response) return compare_result.order case Task.CHOOSE_GREATER: strict_choose_result = cast("_StrictChooseResult", response) match strict_choose_result.choice: case _StrictChoice.A: return Order.GREATER case _StrictChoice.B: return Order.LESS case Task.CHOOSE_GREATER_OR_ABSTAIN: choose_result = cast("_ChooseResult", response) match choose_result.choice: case _Choice.A: return Order.GREATER case _Choice.B: return Order.LESS case _Choice.NEITHER: return Order.NEITHER case Task.CHOOSE_LESSER: strict_choose_result = cast("_StrictChooseResult", response) match strict_choose_result.choice: case _StrictChoice.A: return Order.LESS case _StrictChoice.B: return Order.GREATER case Task.CHOOSE_LESSER_OR_ABSTAIN: choose_result = cast("_ChooseResult", response) match choose_result.choice: case _Choice.A: return Order.LESS case _Choice.B: return Order.GREATER case _Choice.NEITHER: return Order.NEITHER ``` ### filter ```python filter[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, ) -> list[T] ``` Filter an iterable based on a criteria. This method is analogous to Python's built-in [`filter`](https://docs.python.org/3/library/functions.html#filter) function. Parameters: | Name | Type | Description | Default | | ---------- | ---------------------- | ------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `iterable` | `Iterable[T]` | The collection of items to filter. | *required* | | `by` | \`str | None\` | A criteria specifying a predicate to filter by. If this is provided, template cannot be provided. | | `to_str` | \`Callable\[[T], str\] | None\` | If specified, used to convert items to string representation. Otherewise, uses str() on each item. If this is provided, a callable template cannot be provided. | | `template` | \`str | Callable\[[T], str\] | None\` | | `negate` | `bool` | If True, keep items that do not match the criteria. If False, keep items that match the criteria. 
| `False` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | --------- | ------------------------------------------------------------------------- | | `list[T]` | A new list containing items from the iterable if they match the criteria. | Raises: | Type | Description | | ----------------- | ---------------------------------- | | `ValidationError` | If parsing any LLM response fails. | Examples: Basic filter: ```pycon >>> await session.filter(["Tom Hanks", "Tom Cruise", "Tom Brady"], by="actor?") ['Tom Hanks', 'Tom Cruise'] ``` Custom template: ```pycon >>> await session.filter( ... [(123, 321), (384, 483), (134, 431)], ... template=lambda pair: f"Is {pair[0]} backwards {pair[1]}?", ... negate=True, ... ) [(384, 483)] ``` Source code in `src/semlib/filter.py` ```python async def filter[T]( self, iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, ) -> list[T]: """Filter an iterable based on a criteria. This method is analogous to Python's built-in [`filter`](https://docs.python.org/3/library/functions.html#filter) function. Args: iterable: The collection of items to filter. by: A criteria specifying a predicate to filter by. If this is provided, `template` cannot be provided. to_str: If specified, used to convert items to string representation. Otherewise, uses `str()` on each item. If this is provided, a callable template cannot be provided. template: A custom prompt template for predicates. Must be either a string template with a single positional placeholder, or a callable that takes an item and returns a formatted string. If this is provided, `by` cannot be provided. negate: If `True`, keep items that do **not** match the criteria. If `False`, keep items that match the criteria. model: If specified, overrides the default model for this call. Returns: A new list containing items from the iterable if they match the criteria. Raises: ValidationError: If parsing any LLM response fails. Examples: Basic filter: >>> await session.filter(["Tom Hanks", "Tom Cruise", "Tom Brady"], by="actor?") ['Tom Hanks', 'Tom Cruise'] Custom template: >>> await session.filter( ... [(123, 321), (384, 483), (134, 431)], ... template=lambda pair: f"Is {pair[0]} backwards {pair[1]}?", ... negate=True, ... 
) [(384, 483)] """ if template is None: if by is None: msg = "must specify either 'by' or 'template'" raise ValueError(msg) else: if callable(template) and to_str is not None: msg = "cannot provide 'to_str' when a template function is provided" raise ValueError(msg) if by is not None: msg = "cannot provide 'by' when a custom template is provided" raise ValueError(msg) to_str = to_str if to_str is not None else str if template is None: def map_template(item: T, /) -> str: return _DEFAULT_TEMPLATE.format(by=by or "", item=to_str(item)) elif isinstance(template, str): def map_template(item: T, /) -> str: return template.format(to_str(item)) else: # callable map_template = template decisions = await self.map(iterable, map_template, return_type=_Decision, model=model) return [ item for item, decision in zip(iterable, decisions, strict=False) if ((not decision.decision) if negate else decision.decision) ] ``` ### find ```python find[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, ) -> T | None ``` Find an item in an iterable based on a criteria. This method searches through the provided iterable and returns some item (not necessarily the first) that matches the specified criteria. Parameters: | Name | Type | Description | Default | | ---------- | ---------------------- | ------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `iterable` | `Iterable[T]` | The collection of items to search. | *required* | | `by` | \`str | None\` | A criteria specifying a predicate to search by. If this is provided, template cannot be provided. | | `to_str` | \`Callable\[[T], str\] | None\` | If specified, used to convert items to string representation. Otherewise, uses str() on each item. If this is provided, a callable template cannot be provided. | | `template` | \`str | Callable\[[T], str\] | None\` | | `negate` | `bool` | If True, find an item that does not match the criteria. If False, find an item that does match the criteria. | `False` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | ---- | ----------- | | \`T | None\` | Raises: | Type | Description | | ----------------- | ---------------------------------- | | `ValidationError` | If parsing any LLM response fails. | Examples: Basic find: ```pycon >>> await session.find(["Tom Hanks", "Tom Cruise", "Tom Brady"], by="actor?") 'Tom Cruise' # nondeterministic, could also return "Tom Hanks" ``` Custom template: ```pycon >>> await session.find( ... [(123, 321), (384, 483), (134, 431)], ... template=lambda pair: f"Is {pair[0]} backwards {pair[1]}?", ... negate=True, ... ) (384, 483) ``` Source code in `src/semlib/find.py` ```python async def find[T]( self, iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, ) -> T | None: """Find an item in an iterable based on a criteria. This method searches through the provided iterable and returns some item (not necessarily the first) that matches the specified criteria. Args: iterable: The collection of items to search. 
by: A criteria specifying a predicate to search by. If this is provided, `template` cannot be provided. to_str: If specified, used to convert items to string representation. Otherewise, uses `str()` on each item. If this is provided, a callable template cannot be provided. template: A custom prompt template for predicates. Must be either a string template with a single positional placeholder, or a callable that takes an item and returns a formatted string. If this is provided, `by` cannot be provided. negate: If `True`, find an item that does **not** match the criteria. If `False`, find an item that does match the criteria. model: If specified, overrides the default model for this call. Returns: An item from the iterable if it matches the criteria, or `None` if no such item is found. Raises: ValidationError: If parsing any LLM response fails. Examples: Basic find: >>> await session.find(["Tom Hanks", "Tom Cruise", "Tom Brady"], by="actor?") 'Tom Cruise' # nondeterministic, could also return "Tom Hanks" Custom template: >>> await session.find( ... [(123, 321), (384, 483), (134, 431)], ... template=lambda pair: f"Is {pair[0]} backwards {pair[1]}?", ... negate=True, ... ) (384, 483) """ if template is None: if by is None: msg = "must specify either 'by' or 'template'" raise ValueError(msg) else: if callable(template) and to_str is not None: msg = "cannot provide 'to_str' when a template function is provided" raise ValueError(msg) if by is not None: msg = "cannot provide 'by' when a custom template is provided" raise ValueError(msg) to_str = to_str if to_str is not None else str if template is None: def map_template(item: T, /) -> str: return _DEFAULT_TEMPLATE.format(by=by or "", item=to_str(item)) elif isinstance(template, str): def map_template(item: T, /) -> str: return template.format(to_str(item)) else: # callable map_template = template model = model if model is not None else self._model async def fn(item: T) -> tuple[T, bool]: decision = await self.prompt( map_template(item), return_type=_Decision, model=model, ) if negate: decision.decision = not decision.decision return item, decision.decision tasks: list[asyncio.Task[tuple[T, bool]]] = [asyncio.create_task(fn(item)) for item in iterable] try: for next_finished in asyncio.as_completed(tasks): item, decision = await next_finished if decision: return item return None finally: for task in tasks: if not task.done(): task.cancel() await asyncio.wait(tasks) ``` ### map ```python map[T, U: BaseModel]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: type[U], model: str | None = None, ) -> list[U] ``` ```python map[T, U]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: Bare[U], model: str | None = None, ) -> list[U] ``` ```python map[T]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: None = None, model: str | None = None, ) -> list[str] ``` Map a prompt template over an iterable and get responses from the language model. This method applies a prompt template to each item in the provided iterable, sends the resulting prompts to the language model, and collects the responses. The responses can be returned as raw strings, parsed into [Pydantic](https://pydantic.dev/) models, or extracted as bare values using the Bare marker. This method is analogous to Python's built-in [`map`](https://docs.python.org/3/library/functions.html#map) function. 
Parameters: | Name | Type | Description | Default | | ------------- | ------------- | ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `iterable` | `Iterable[T]` | The collection of items to map over. | *required* | | `template` | \`str | Callable\[[T], str\]\` | A prompt template to apply to each item. This can be either a string template with a single positional placeholder, or a callable that takes an item and returns a formatted string. | | `return_type` | \`type[U] | Bare[V] | None\` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | --------- | ----------- | | \`list[U] | list[V] | Raises: | Type | Description | | ----------------- | ------------------------------------------------------------------------------------------------------------ | | `ValidationError` | If return_type is a Pydantic model or a Bare type and the response cannot be parsed into the specified type. | Examples: Basic map: ```pycon >>> await session.map( ... ["apple", "banana", "kiwi"], ... template="What color is {}? Reply in a single word.", ... ) ['Red.', 'Yellow.', 'Green.'] ``` Map with structured return type: ```pycon >>> class Person(pydantic.BaseModel): ... name: str ... age: int >>> await session.map( ... ["Barack Obama", "Angela Merkel"], ... template="Who is {}?", ... return_type=Person, ... ) [Person(name='Barack Obama', age=62), Person(name='Angela Merkel', age=69)] ``` Map with bare return type: ```pycon >>> await session.map( ... [42, 1337, 2025], ... template="What are the unique prime factors of {}?", ... return_type=Bare(list[int]), ... ) [[2, 3, 7], [7, 191], [3, 5]] ``` Source code in `src/semlib/map.py` ```python async def map[T, U: BaseModel, V]( self, iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: type[U] | Bare[V] | None = None, model: str | None = None, ) -> list[U] | list[V] | list[str]: """Map a prompt template over an iterable and get responses from the language model. This method applies a prompt template to each item in the provided iterable, sends the resulting prompts to the language model, and collects the responses. The responses can be returned as raw strings, parsed into [Pydantic](https://pydantic.dev/) models, or extracted as bare values using the [Bare][semlib.bare.Bare] marker. This method is analogous to Python's built-in [`map`](https://docs.python.org/3/library/functions.html#map) function. Args: iterable: The collection of items to map over. template: A prompt template to apply to each item. This can be either a string template with a single positional placeholder, or a callable that takes an item and returns a formatted string. return_type: If not specified, the responses are returned as raw strings. If a Pydantic model class is provided, the responses are parsed into instances of that model. If a [Bare][semlib.bare.Bare] instance is provided, single values of the specified type are extracted from the responses. model: If specified, overrides the default model for this call. Returns: A list of responses from the language model in the format specified by return_type. Raises: ValidationError: If `return_type` is a Pydantic model or a Bare type and the response cannot be parsed into the specified type. Examples: Basic map: >>> await session.map( ... ["apple", "banana", "kiwi"], ... 
template="What color is {}? Reply in a single word.", ... ) ['Red.', 'Yellow.', 'Green.'] Map with structured return type: >>> class Person(pydantic.BaseModel): ... name: str ... age: int >>> await session.map( ... ["Barack Obama", "Angela Merkel"], ... template="Who is {}?", ... return_type=Person, ... ) [Person(name='Barack Obama', age=62), Person(name='Angela Merkel', age=69)] Map with bare return type: >>> await session.map( ... [42, 1337, 2025], ... template="What are the unique prime factors of {}?", ... return_type=Bare(list[int]), ... ) [[2, 3, 7], [7, 191], [3, 5]] """ formatter = template.format if isinstance(template, str) else template model = model if model is not None else self._model # case analysis for type checker if return_type is None: return await util.gather(*[self.prompt(formatter(item), model=model) for item in iterable]) if isinstance(return_type, Bare): return await util.gather( *[self.prompt(formatter(item), return_type=return_type, model=model) for item in iterable] ) return await util.gather( *[self.prompt(formatter(item), return_type=return_type, model=model) for item in iterable] ) ``` ### max ```python max[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> T ``` Get the largest item in an iterable. This method finds the largest item in a collection by using a language model to perform pairwise comparisons. This method is analogous to Python's built-in [`max`](https://docs.python.org/3/library/functions.html#max) function. Parameters: | Name | Type | Description | Default | | ---------- | ---------------------- | ---------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `iterable` | `Iterable[T]` | The collection of items to search. | *required* | | `by` | \`str | None\` | A criteria specifying what aspect to compare by. If this is provided, template cannot be provided. | | `to_str` | \`Callable\[[T], str\] | None\` | If specified, used to convert items to string representation. Otherewise, uses str() on each item. If this is provided, a callable template cannot be provided. | | `template` | \`str | Callable\[[T, T], str\] | None\` | | `task` | \`Task | str | None\` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | ---- | ----------------- | | `T` | The largest item. | Raises: | Type | Description | | ----------------- | ---------------------------------- | | `ValidationError` | If parsing any LLM response fails. | Examples: Basic usage: ```pycon >>> await session.max( ... ["LeBron James", "Kobe Bryant", "Magic Johnson"], by="assists" ... ) 'Magic Johnson' ``` Source code in `src/semlib/extrema.py` ```python async def max[T]( self, iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> T: """Get the largest item in an iterable. This method finds the largest item in a collection by using a language model to perform pairwise comparisons. This method is analogous to Python's built-in [`max`](https://docs.python.org/3/library/functions.html#max) function. Args: iterable: The collection of items to search. 
by: A criteria specifying what aspect to compare by. If this is provided, `template` cannot be provided. to_str: If specified, used to convert items to string representation. Otherewise, uses `str()` on each item. If this is provided, a callable template cannot be provided. template: A custom prompt template for comparisons. Must be either a string template with two positional placeholders, or a callable that takes two items and returns a formatted string. If this is provided, `by` cannot be provided. task: The type of comparison task that is being performed in `template`. This allows for writing the template in the most convenient way possible (e.g., in some scenarios, it's easier to specify a criteria for which item is lesser, and in others, it's easier to specify a criteria for which item is greater). If this is provided, a custom `template` must also be provided. Defaults to [Task.CHOOSE_GREATER][semlib.compare.Task.CHOOSE_GREATER] if not specified. model: If specified, overrides the default model for this call. Returns: The largest item. Raises: ValidationError: If parsing any LLM response fails. Examples: Basic usage: >>> await session.max( ... ["LeBron James", "Kobe Bryant", "Magic Johnson"], by="assists" ... ) 'Magic Johnson' """ return await self._get_extreme( list(iterable), find_min=False, by=by, to_str=to_str, template=template, task=task, model=model, ) ``` ### min ```python min[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> T ``` Get the smallest item in an iterable. This method finds the smallest item in a collection by using a language model to perform pairwise comparisons. This method is analogous to Python's built-in [`min`](https://docs.python.org/3/library/functions.html#min) function. Parameters: | Name | Type | Description | Default | | ---------- | ---------------------- | ---------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `iterable` | `Iterable[T]` | The collection of items to search. | *required* | | `by` | \`str | None\` | A criteria specifying what aspect to compare by. If this is provided, template cannot be provided. | | `to_str` | \`Callable\[[T], str\] | None\` | If specified, used to convert items to string representation. Otherewise, uses str() on each item. If this is provided, a callable template cannot be provided. | | `template` | \`str | Callable\[[T, T], str\] | None\` | | `task` | \`Task | str | None\` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | ---- | ------------------ | | `T` | The smallest item. | Raises: | Type | Description | | ----------------- | ---------------------------------- | | `ValidationError` | If parsing any LLM response fails. | Examples: Basic usage: ```pycon >>> await session.min(["blue", "red", "green"], by="wavelength") 'blue' ``` Source code in `src/semlib/extrema.py` ```python async def min[T]( self, iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> T: """Get the smallest item in an iterable. 
This method finds the smallest item in a collection by using a language model to perform pairwise comparisons. This method is analogous to Python's built-in [`min`](https://docs.python.org/3/library/functions.html#min) function. Args: iterable: The collection of items to search. by: A criteria specifying what aspect to compare by. If this is provided, `template` cannot be provided. to_str: If specified, used to convert items to string representation. Otherewise, uses `str()` on each item. If this is provided, a callable template cannot be provided. template: A custom prompt template for comparisons. Must be either a string template with two positional placeholders, or a callable that takes two items and returns a formatted string. If this is provided, `by` cannot be provided. task: The type of comparison task that is being performed in `template`. This allows for writing the template in the most convenient way possible (e.g., in some scenarios, it's easier to specify a criteria for which item is lesser, and in others, it's easier to specify a criteria for which item is greater). If this is provided, a custom `template` must also be provided. Defaults to [Task.CHOOSE_GREATER][semlib.compare.Task.CHOOSE_GREATER] if not specified. model: If specified, overrides the default model for this call. Returns: The smallest item. Raises: ValidationError: If parsing any LLM response fails. Examples: Basic usage: >>> await session.min(["blue", "red", "green"], by="wavelength") 'blue' """ return await self._get_extreme( list(iterable), find_min=True, by=by, to_str=to_str, template=template, task=task, model=model, ) ``` ### prompt ```python prompt[T: BaseModel]( prompt: str, /, *, return_type: type[T], model: str | None = None, ) -> T ``` ```python prompt[T]( prompt: str, /, *, return_type: Bare[T], model: str | None = None, ) -> T ``` ```python prompt( prompt: str, /, *, return_type: None = None, model: str | None = None, ) -> str ``` Send a prompt to the language model and get a response. This method sends a single user message to the language model and returns the response. The response can be returned as a raw string, parsed into a [Pydantic](https://pydantic.dev/) model, or extracted as a bare value using the Bare marker. Parameters: | Name | Type | Description | Default | | ------------- | --------- | ---------------------------------------------- | -------------------------------------------------------- | | `prompt` | `str` | The text prompt to send to the language model. | *required* | | `return_type` | \`type[T] | Bare[U] | None\` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | ----- | ----------- | | \`str | T | Raises: | Type | Description | | ----------------- | ------------------------------------------------------------------------------------------------------------ | | `ValidationError` | If return_type is a Pydantic model or a Bare type and the response cannot be parsed into the specified type. | Examples: Get raw string response: ```pycon >>> await session.prompt("What is 2+2?") '2 + 2 equals 4.' ``` Get structured value: ```pycon >>> class Person(BaseModel): ... name: str ... 
age: int >>> await session.prompt("Who is Barack Obama?", return_type=Person) Person(name='Barack Obama', age=62) ``` Get bare value: ```pycon >>> await session.prompt("What is 2+2?", return_type=Bare(int)) 4 ``` Source code in `src/semlib/_internal/base.py` ```python async def prompt[T: BaseModel, U]( self, prompt: str, /, *, return_type: type[T] | Bare[U] | None = None, model: str | None = None ) -> str | T | U: """Send a prompt to the language model and get a response. This method sends a single user message to the language model and returns the response. The response can be returned as a raw string, parsed into a [Pydantic](https://pydantic.dev/) model, or extracted as a bare value using the [Bare][semlib.bare.Bare] marker. Args: prompt: The text prompt to send to the language model. return_type: If not specified, the response is returned as a raw string. If a Pydantic model class is provided, the response is parsed into an instance of that model. If a [Bare][semlib.bare.Bare] instance is provided, a single value of the specified type is extracted from the response. model: If specified, overrides the default model for this call. Returns: The language model's response in the format specified by return_type. Raises: ValidationError: If `return_type` is a Pydantic model or a Bare type and the response cannot be parsed into the specified type. Examples: Get raw string response: >>> await session.prompt("What is 2+2?") '2 + 2 equals 4.' Get structured value: >>> class Person(BaseModel): ... name: str ... age: int >>> await session.prompt("Who is Barack Obama?", return_type=Person) Person(name='Barack Obama', age=62) Get bare value: >>> await session.prompt("What is 2+2?", return_type=Bare(int)) 4 """ return await self._acompletion( messages=[Message(role="user", content=prompt)], return_type=return_type, model=model ) ``` ### reduce ```python reduce( iterable: Iterable[str], /, template: str | Callable[[str, str], str], *, associative: bool = False, model: str | None = None, ) -> str ``` ```python reduce[T]( iterable: Iterable[str | T], /, template: str | Callable[[str | T, str | T], str], *, associative: bool = False, model: str | None = None, ) -> str | T ``` ```python reduce[T: BaseModel]( iterable: Iterable[T], /, template: str | Callable[[T, T], str], *, return_type: type[T], associative: bool = False, model: str | None = None, ) -> T ``` ```python reduce[T]( iterable: Iterable[T], /, template: str | Callable[[T, T], str], *, return_type: Bare[T], associative: bool = False, model: str | None = None, ) -> T ``` ```python reduce[T, U: BaseModel]( iterable: Iterable[T], /, template: str | Callable[[U, T], str], initial: U, *, return_type: type[U], model: str | None = None, ) -> U ``` ```python reduce[T, U]( iterable: Iterable[T], /, template: str | Callable[[U, T], str], initial: U, *, return_type: Bare[U], model: str | None = None, ) -> U ``` Reduce an iterable to a single value using a language model. This method is analogous to Python's [`functools.reduce`](https://docs.python.org/3/library/functools.html#functools.reduce) function. 
Parameters: | Name | Type | Description | Default | | ------------- | --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `iterable` | `Iterable[Any]` | The collection of items to reduce. | *required* | | `template` | \`str | Callable\[[Any, Any], str\]\` | A prompt template to apply to each item. This can be either a string template with two positional placeholders (with the first placeholder being the accumulator and the second placeholder being an item), or a callable that takes an accumulator and an item and returns a formatted string. | | `initial` | `Any` | If provided, this value is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. | `None` | | `return_type` | `Any` | The return type is also the type of the accumulator. If not specified, the responses are returned as raw strings. If a Pydantic model class is provided, the responses are parsed into instances of that model. If a Bare instance is provided, single values of the specified type are extracted from the responses. | `None` | | `associative` | `bool` | If True, the reduction is performed in a balanced tree manner, which unlocks concurrency and can provide significant speedups for large iterables. This requires the reduction operation to be associative. | `False` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | ----- | ---------------------------- | | `Any` | The final accumulated value. | Raises: | Type | Description | | ----------------- | ------------------------------------------------------------------------------------------------------------ | | `ValidationError` | If return_type is a Pydantic model or a Bare type and the response cannot be parsed into the specified type. | Examples: Basic reduce: ```pycon >>> await session.reduce( ... ["one", "three", "seven", "twelve"], "{} + {} = ?", return_type=Bare(int) ... ) 23 ``` Reduce with initial value: ```pycon >>> await session.reduce( ... range(20), ... template=lambda acc, n: f"If {n} is prime, append it to this list: {acc}.", ... initial=[], ... return_type=Bare(list[int]), ... model="openai/o4-mini", ... ) [2, 3, 5, 7, 11, 13, 17, 19] ``` Associative reduce: ```pycon >>> await session.reduce( ... [[i] for i in range(20)], ... template=lambda acc, ... n: f"Compute the union of these two sets, and then remove any non-prime numbers: {acc} and {n}. Return the result as a list.", ... return_type=Bare(list[int]), ... associative=True, ... model="openai/o4-mini", ... ) [2, 3, 5, 7, 11, 13, 17, 19] ``` Distinguishing between leaf nodes and internal nodes in an associative reduce with Box: ```pycon >>> reviews: list[str] = [ ... "The instructions are a bit confusing. It took me a while to figure out how to use it.", ... "It's so loud!", ... "I regret buying this microwave. It's the worst appliance I've ever owned.", ... "This microwave is great! 
It heats up food quickly and evenly.", ... "This microwave is a waste of money. It doesn't work at all.", ... "I hate the design of this microwave. It looks cheap and ugly.", ... "The turntable is a bit small, so I can't fit larger plates in it.", ... "The microwave is a bit noisy when it's running.", ... "The microwave is a bit expensive compared to other models with similar features.", ... "The turntable is useless, so I can't fit any plates in it.", ... "I love the sleek design of this microwave. It looks great in my kitchen.", ... ] >>> def template(a: str | Box[str], b: str | Box[str]) -> str: ... # leaf nodes (raw reviews) ... if isinstance(a, Box) and isinstance(b, Box): ... return f''' ... Consider the following two product reviews, and return a bulleted list ... summarizing any actionable product improvements that could be made based on ... the reviews. If there are no actionable product improvements, return an empty ... string. ... ... - Review 1: {a.value} ... - Review 2: {b.value}''' ... # summaries of reviews ... if not isinstance(a, Box) and not isinstance(b, Box): ... return f''' ... Consider the following two lists of ideas for product improvements, and ... combine them while de-duplicating similar ideas. If there are no ideas, return ... an empty string. ... ... # List 1: ... {a} ... ... # List 2: ... {b}''' ... # one is a summary, the other is a raw review ... if isinstance(a, Box) and not isinstance(b, Box): ... ideas = b ... review = a.value ... if not isinstance(a, Box) and isinstance(b, Box): ... ideas = a ... review = b.value ... return f''' ... Consider the following list of ideas for product improvements, and a product ... review. Update the list of ideas based on the review, de-duplicating similar ... ideas. If there are no ideas, return an empty string. ... ... # List of ideas: ... {ideas} ... ... # Review: ... {review}''' >>> result = await session.reduce( ... map(Box, reviews), template=template, associative=True ... ) >>> print(result) - Clarify and simplify the product instructions to make them easier to understand. - Consider reducing the noise level of the product to make it quieter during operation. - Improve product reliability to ensure the microwave functions correctly for all users. - Increase the size or adjust the design of the turntable to accommodate larger plates. - Improve the design to enhance the aesthetic appeal and make it look more premium. ``` Source code in `src/semlib/reduce.py` ```python async def reduce( self, iterable: Iterable[Any], /, template: str | Callable[[Any, Any], str], initial: Any = None, *, return_type: Any = None, associative: bool = False, model: str | None = None, ) -> Any: """Reduce an iterable to a single value using a language model. This method is analogous to Python's [`functools.reduce`](https://docs.python.org/3/library/functools.html#functools.reduce) function. Args: iterable: The collection of items to reduce. template: A prompt template to apply to each item. This can be either a string template with two positional placeholders (with the first placeholder being the accumulator and the second placeholder being an item), or a callable that takes an accumulator and an item and returns a formatted string. initial: If provided, this value is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. return_type: The return type is also the type of the accumulator. If not specified, the responses are returned as raw strings. 
If a Pydantic model class is provided, the responses are parsed into instances of that model. If a [Bare][semlib.bare.Bare] instance is provided, single values of the specified type are extracted from the responses. associative: If `True`, the reduction is performed in a balanced tree manner, which unlocks concurrency and can provide significant speedups for large iterables. This requires the reduction operation to be associative. model: If specified, overrides the default model for this call. Returns: The final accumulated value. Raises: ValidationError: If `return_type` is a Pydantic model or a Bare type and the response cannot be parsed into the specified type. Examples: Basic reduce: >>> await session.reduce( ... ["one", "three", "seven", "twelve"], "{} + {} = ?", return_type=Bare(int) ... ) 23 Reduce with initial value: >>> await session.reduce( ... range(20), ... template=lambda acc, n: f"If {n} is prime, append it to this list: {acc}.", ... initial=[], ... return_type=Bare(list[int]), ... model="openai/o4-mini", ... ) [2, 3, 5, 7, 11, 13, 17, 19] Associative reduce: >>> await session.reduce( ... [[i] for i in range(20)], ... template=lambda acc, ... n: f"Compute the union of these two sets, and then remove any non-prime numbers: {acc} and {n}. Return the result as a list.", ... return_type=Bare(list[int]), ... associative=True, ... model="openai/o4-mini", ... ) [2, 3, 5, 7, 11, 13, 17, 19] Distinguishing between leaf nodes and internal nodes in an associative reduce with [Box][semlib.box.Box]: >>> reviews: list[str] = [ ... "The instructions are a bit confusing. It took me a while to figure out how to use it.", ... "It's so loud!", ... "I regret buying this microwave. It's the worst appliance I've ever owned.", ... "This microwave is great! It heats up food quickly and evenly.", ... "This microwave is a waste of money. It doesn't work at all.", ... "I hate the design of this microwave. It looks cheap and ugly.", ... "The turntable is a bit small, so I can't fit larger plates in it.", ... "The microwave is a bit noisy when it's running.", ... "The microwave is a bit expensive compared to other models with similar features.", ... "The turntable is useless, so I can't fit any plates in it.", ... "I love the sleek design of this microwave. It looks great in my kitchen.", ... ] >>> def template(a: str | Box[str], b: str | Box[str]) -> str: ... # leaf nodes (raw reviews) ... if isinstance(a, Box) and isinstance(b, Box): ... return f''' ... Consider the following two product reviews, and return a bulleted list ... summarizing any actionable product improvements that could be made based on ... the reviews. If there are no actionable product improvements, return an empty ... string. ... ... - Review 1: {a.value} ... - Review 2: {b.value}''' ... # summaries of reviews ... if not isinstance(a, Box) and not isinstance(b, Box): ... return f''' ... Consider the following two lists of ideas for product improvements, and ... combine them while de-duplicating similar ideas. If there are no ideas, return ... an empty string. ... ... # List 1: ... {a} ... ... # List 2: ... {b}''' ... # one is a summary, the other is a raw review ... if isinstance(a, Box) and not isinstance(b, Box): ... ideas = b ... review = a.value ... if not isinstance(a, Box) and isinstance(b, Box): ... ideas = a ... review = b.value ... return f''' ... Consider the following list of ideas for product improvements, and a product ... review. Update the list of ideas based on the review, de-duplicating similar ... ideas. 
If there are no ideas, return an empty string. ... ... # List of ideas: ... {ideas} ... ... # Review: ... {review}''' >>> result = await session.reduce( ... map(Box, reviews), template=template, associative=True ... ) >>> print(result) - Clarify and simplify the product instructions to make them easier to understand. - Consider reducing the noise level of the product to make it quieter during operation. - Improve product reliability to ensure the microwave functions correctly for all users. - Increase the size or adjust the design of the turntable to accommodate larger plates. - Improve the design to enhance the aesthetic appeal and make it look more premium. """ if initial is not None: return await self._reduce2( iterable, template, initial, return_type=return_type, model=model, ) return await self._reduce1( iterable, template, return_type=return_type, associative=associative, model=model, ) ``` ### sort ```python sort[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, algorithm: Algorithm | None = None, reverse: bool = False, model: str | None = None, ) -> list[T] ``` Sort an iterable. This method sorts a collection of items by using a language model to perform pairwise comparisons. The sorting algorithm determines which comparisons to make and how to aggregate the results into a final ranking. This method is analogous to Python's built-in [`sorted`](https://docs.python.org/3/library/functions.html#sorted) function. Parameters: | Name | Type | Description | Default | | ----------- | ---------------------- | --------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `iterable` | `Iterable[T]` | The collection of items to sort. | *required* | | `by` | \`str | None\` | A criteria specifying what aspect to compare by. If this is provided, template cannot be provided. | | `to_str` | \`Callable\[[T], str\] | None\` | If specified, used to convert items to string representation. Otherewise, uses str() on each item. If this is provided, a callable template cannot be provided. | | `template` | \`str | Callable\[[T, T], str\] | None\` | | `task` | \`Task | str | None\` | | `algorithm` | \`Algorithm | None\` | The sorting algorithm to use. If not specified, defaults to BordaCount. Different algorithms make different tradeoffs between accuracy, latency, and cost. See the documentation for each algorithm for details. | | `reverse` | `bool` | If True, sort in descending order. If False, sort in ascending order. | `False` | | `model` | \`str | None\` | If specified, overrides the default model for this call. | Returns: | Type | Description | | --------- | ---------------------------------------------------------------- | | `list[T]` | A new list containing all items from the iterable in sort order. | Raises: | Type | Description | | ----------------- | ---------------------------------- | | `ValidationError` | If parsing any LLM response fails. | Examples: Basic sort: ```pycon >>> await session.sort(["blue", "red", "green"], by="wavelength", reverse=True) ['red', 'green', 'blue'] ``` Custom template and task: ```pycon >>> from dataclasses import dataclass >>> @dataclass ... class Person: ... name: str ... job: str >>> people = [ ... 
Person(name="Barack Obama", job="President of the United States"), ... Person(name="Dalai Lama", job="Spiritual Leader of Tibet"), ... Person(name="Sundar Pichai", job="CEO of Google"), ... ] >>> await session.sort( ... people, ... template=lambda a, b: f"Which job earns more, (a) {a.job} or (b) {b.job}?", ... ) [ Person(name='Dalai Lama', job='Spiritual Leader of Tibet'), Person(name='Barack Obama', job='President of the United States'), Person(name='Sundar Pichai', job='CEO of Google'), ] ``` Source code in `src/semlib/sort/sort.py` ```python async def sort[T]( self, iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, algorithm: Algorithm | None = None, reverse: bool = False, model: str | None = None, ) -> list[T]: """Sort an iterable. This method sorts a collection of items by using a language model to perform pairwise comparisons. The sorting algorithm determines which comparisons to make and how to aggregate the results into a final ranking. This method is analogous to Python's built-in [`sorted`](https://docs.python.org/3/library/functions.html#sorted) function. Args: iterable: The collection of items to sort. by: A criteria specifying what aspect to compare by. If this is provided, `template` cannot be provided. to_str: If specified, used to convert items to string representation. Otherewise, uses `str()` on each item. If this is provided, a callable template cannot be provided. template: A custom prompt template for comparisons. Must be either a string template with two positional placeholders, or a callable that takes two items and returns a formatted string. If this is provided, `by` cannot be provided. task: The type of comparison task that is being performed in `template`. This allows for writing the template in the most convenient way possible (e.g., in some scenarios, it's easier to specify a criteria for which item is lesser, and in others, it's easier to specify a criteria for which item is greater). If this is provided, a custom `template` must also be provided. Defaults to [Task.CHOOSE_GREATER][semlib.compare.Task.CHOOSE_GREATER] if not specified. algorithm: The sorting algorithm to use. If not specified, defaults to [BordaCount][semlib.sort.algorithm.BordaCount]. Different algorithms make different tradeoffs between accuracy, latency, and cost. See the documentation for each algorithm for details. reverse: If `True`, sort in descending order. If `False`, sort in ascending order. model: If specified, overrides the default model for this call. Returns: A new list containing all items from the iterable in sort order. Raises: ValidationError: If parsing any LLM response fails. Examples: Basic sort: >>> await session.sort(["blue", "red", "green"], by="wavelength", reverse=True) ['red', 'green', 'blue'] Custom template and task: >>> from dataclasses import dataclass >>> @dataclass ... class Person: ... name: str ... job: str >>> people = [ ... Person(name="Barack Obama", job="President of the United States"), ... Person(name="Dalai Lama", job="Spiritual Leader of Tibet"), ... Person(name="Sundar Pichai", job="CEO of Google"), ... ] >>> await session.sort( ... people, ... template=lambda a, b: f"Which job earns more, (a) {a.job} or (b) {b.job}?", ... 
) [ Person(name='Dalai Lama', job='Spiritual Leader of Tibet'), Person(name='Barack Obama', job='President of the United States'), Person(name='Sundar Pichai', job='CEO of Google'), ] """ algorithm = algorithm if algorithm is not None else BordaCount() async def comparator(a: T, b: T) -> Order: return await self.compare(a, b, by=by, to_str=to_str, template=template, task=task, model=model) return await algorithm._sort( # noqa: SLF001 iterable, reverse=reverse, comparator=comparator, max_concurrency=self._max_concurrency ) ``` ### total_cost ```python total_cost() -> float ``` Get the total cost incurred so far for API calls made through this instance. Returns: | Type | Description | | ------- | ---------------------- | | `float` | The total cost in USD. | Source code in `src/semlib/_internal/base.py` ```python def total_cost(self) -> float: """Get the total cost incurred so far for API calls made through this instance. Returns: The total cost in USD. """ return self._total_cost ``` ## QueryCache Bases: `ABC` Abstract base class for a cache of LLM query results. Caches can be used with Session to avoid repeating identical queries to the LLM. Source code in `src/semlib/cache.py` ```python class QueryCache(ABC): """Abstract base class for a cache of LLM query results. Caches can be used with [Session][semlib.session.Session] to avoid repeating identical queries to the LLM. """ @abstractmethod def _set[T: BaseModel](self, key: CacheKey[T], value: str) -> None: ... @abstractmethod def _get[T: BaseModel](self, key: CacheKey[T]) -> str | None: ... @abstractmethod def clear(self) -> None: ... @abstractmethod def __len__(self) -> int: ... def _hash_key[T: BaseModel](self, key: CacheKey[T]) -> bytes: messages, pydantic_model, llm_model = key key_components: list[str] = [llm_model] key_components.extend(message.to_json() for message in messages) if pydantic_model is not None: key_components.append(json.dumps(pydantic_model.model_json_schema())) h = sha256() for part in key_components: h.update(part.encode("utf-8")) return h.digest() ``` ## OnDiskCache Bases: `QueryCache` A persistent on-disk cache of LLM query results, backed by SQLite. Source code in `src/semlib/cache.py` ```python class OnDiskCache(QueryCache): """A persistent on-disk cache of LLM query results, backed by SQLite.""" @override def __init__(self, path: str) -> None: """Initialize an on-disk cache. Args: path: Path to the SQLite database file. If the file does not exist, it will be created. By convention, the filename should have a ".db" or ".sqlite" extension. 
""" self._conn = sqlite3.connect(path, autocommit=True) self._conn.execute( """ CREATE TABLE IF NOT EXISTS metadata ( key TEXT PRIMARY KEY, value TEXT NOT NULL ) """ ) cur = self._conn.execute("SELECT value FROM metadata WHERE key = ?", (_VERSION_KEY,)) row = cur.fetchone() if row is None: self._conn.execute("INSERT INTO metadata (key, value) VALUES (?, ?)", (_VERSION_KEY, _VERSION)) elif row[0] != _VERSION: msg = f"cache version mismatch: expected {_VERSION}, got {row[0]}" raise ValueError(msg) self._conn.execute( """ CREATE TABLE IF NOT EXISTS data ( key BLOB PRIMARY KEY, value TEXT NOT NULL ) """ ) @override def _set[T: BaseModel](self, key: CacheKey[T], value: str) -> None: self._conn.execute( "INSERT OR REPLACE INTO data (key, value) VALUES (?, ?)", (self._hash_key(key), value), ) @override def _get[T: BaseModel](self, key: CacheKey[T]) -> str | None: cur = self._conn.execute("SELECT value FROM data WHERE key = ?", (self._hash_key(key),)) row = cur.fetchone() if row is None: return None return cast("str", row[0]) @override def clear(self) -> None: self._conn.execute("DELETE FROM data") @override def __len__(self) -> int: cur = self._conn.execute("SELECT COUNT(*) FROM data") row = cur.fetchone() return cast("int", row[0]) ``` ### __init__ ```python __init__(path: str) -> None ``` Initialize an on-disk cache. Parameters: | Name | Type | Description | Default | | ------ | ----- | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | | `path` | `str` | Path to the SQLite database file. If the file does not exist, it will be created. By convention, the filename should have a ".db" or ".sqlite" extension. | *required* | Source code in `src/semlib/cache.py` ```python @override def __init__(self, path: str) -> None: """Initialize an on-disk cache. Args: path: Path to the SQLite database file. If the file does not exist, it will be created. By convention, the filename should have a ".db" or ".sqlite" extension. """ self._conn = sqlite3.connect(path, autocommit=True) self._conn.execute( """ CREATE TABLE IF NOT EXISTS metadata ( key TEXT PRIMARY KEY, value TEXT NOT NULL ) """ ) cur = self._conn.execute("SELECT value FROM metadata WHERE key = ?", (_VERSION_KEY,)) row = cur.fetchone() if row is None: self._conn.execute("INSERT INTO metadata (key, value) VALUES (?, ?)", (_VERSION_KEY, _VERSION)) elif row[0] != _VERSION: msg = f"cache version mismatch: expected {_VERSION}, got {row[0]}" raise ValueError(msg) self._conn.execute( """ CREATE TABLE IF NOT EXISTS data ( key BLOB PRIMARY KEY, value TEXT NOT NULL ) """ ) ``` ## InMemoryCache Bases: `QueryCache` An in-memory cache of LLM query results. Source code in `src/semlib/cache.py` ```python class InMemoryCache(QueryCache): """An in-memory cache of LLM query results.""" @override def __init__(self) -> None: """Initialize an in-memory cache.""" self._data: dict[bytes, str] = {} @override def _set[T: BaseModel](self, key: CacheKey[T], value: str) -> None: self._data[self._hash_key(key)] = value @override def _get[T: BaseModel](self, key: CacheKey[T]) -> str | None: return self._data.get(self._hash_key(key)) @override def clear(self) -> None: self._data.clear() @override def __len__(self) -> int: return len(self._data) ``` ### __init__ ```python __init__() -> None ``` Initialize an in-memory cache. 
Source code in `src/semlib/cache.py` ```python @override def __init__(self) -> None: """Initialize an in-memory cache.""" self._data: dict[bytes, str] = {} ``` ## Bare A marker to indicate that a function should return a bare value of type `T`. This can be passed to the `return_type` parameter of functions like prompt. For situations where you want to extract a single value of a given base type (like `int` or `list[float]`), this is more convenient than the alternative of defining a Pydantic model with a single field for the purpose of extracting that value. Examples: Extract a bare value using prompt: ```pycon >>> await session.prompt("What is 2+2?", return_type=Bare(int)) 4 ``` Influence model output using `class_name` and `field_name`: ```pycon >>> await session.prompt( ... "Give me a list", ... return_type=Bare( ... list[int], class_name="list_of_three_values", field_name="primes" ... ), ... ) [3, 7, 11] ``` Source code in `src/semlib/bare.py` ```python class Bare[T]: """A marker to indicate that a function should return a bare value of type `T`. This can be passed to the `return_type` parameter of functions like [prompt][semlib._internal.base.Base.prompt]. For situations where you want to extract a single value of a given base type (like `int` or `list[float]`), this is more convenient than the alternative of defining a Pydantic model with a single field for the purpose of extracting that value. Examples: Extract a bare value using prompt: >>> await session.prompt("What is 2+2?", return_type=Bare(int)) 4 Influence model output using `class_name` and `field_name`: >>> await session.prompt( ... "Give me a list", ... return_type=Bare( ... list[int], class_name="list_of_three_values", field_name="primes" ... ), ... ) [3, 7, 11] """ def __init__(self, typ: type[T], /, class_name: str | None = None, field_name: str | None = None): """Initialize a Bare instance. Args: typ: The type of the bare value to extract. class_name: Name for a dynamically created Pydantic model class. If not provided, defaults to the name of `typ`. This name is visible to the LLM and may affect model output. field_name: Name for the field in the dynamically created Pydantic model that holds the bare value. If not provided, defaults to "value". This name is visible to the LLM and may affect model output. """ self._typ = typ self._class_name = class_name if class_name is not None else typ.__name__ self._field_name = field_name if field_name is not None else "value" field_definitions: Any = {self._field_name: (self._typ, ...)} self._model: type[pydantic.BaseModel] = pydantic.create_model(self._class_name, **field_definitions) def _extract(self, obj: Any) -> T: if isinstance(obj, self._model): return cast("T", getattr(obj, self._field_name)) msg = f"expected instance of {self._model.__name__}, got {type(obj).__name__}" raise TypeError(msg) ``` ### __init__ ```python __init__( typ: type[T], /, class_name: str | None = None, field_name: str | None = None, ) ``` Initialize a Bare instance. Parameters: | Name | Type | Description | Default | | ------------ | --------- | -------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `typ` | `type[T]` | The type of the bare value to extract. | *required* | | `class_name` | \`str | None\` | Name for a dynamically created Pydantic model class. If not provided, defaults to the name of typ. 
This name is visible to the LLM and may affect model output. | | `field_name` | \`str | None\` | Name for the field in the dynamically created Pydantic model that holds the bare value. If not provided, defaults to "value". This name is visible to the LLM and may affect model output. | Source code in `src/semlib/bare.py` ```python def __init__(self, typ: type[T], /, class_name: str | None = None, field_name: str | None = None): """Initialize a Bare instance. Args: typ: The type of the bare value to extract. class_name: Name for a dynamically created Pydantic model class. If not provided, defaults to the name of `typ`. This name is visible to the LLM and may affect model output. field_name: Name for the field in the dynamically created Pydantic model that holds the bare value. If not provided, defaults to "value". This name is visible to the LLM and may affect model output. """ self._typ = typ self._class_name = class_name if class_name is not None else typ.__name__ self._field_name = field_name if field_name is not None else "value" field_definitions: Any = {self._field_name: (self._typ, ...)} self._model: type[pydantic.BaseModel] = pydantic.create_model(self._class_name, **field_definitions) ``` ## Box A container that holds a value of type `T`. This can be used to tag values, so that they can be distinguished from other values of the same underlying type. Such a tag can be useful in the context of methods like reduce, where you can use this marker to distinguish leaf nodes from internal nodes in an associative reduce. Source code in `src/semlib/box.py` ```python class Box[T]: """A container that holds a value of type `T`. This can be used to tag values, so that they can be distinguished from other values of the same underlying type. Such a tag can be useful in the context of methods like [reduce][semlib.reduce.Reduce.reduce], where you can use this marker to distinguish leaf nodes from internal nodes in an associative reduce. """ def __init__(self, value: T) -> None: """Initialize a Box instance. Args: value: The value to be contained in the Box. """ self._value = value @property def value(self) -> T: """Get the value contained in the Box. Returns: The value contained in the Box. """ return self._value ``` ### value ```python value: T ``` Get the value contained in the Box. Returns: | Type | Description | | ---- | ------------------------------- | | `T` | The value contained in the Box. | ### __init__ ```python __init__(value: T) -> None ``` Initialize a Box instance. Parameters: | Name | Type | Description | Default | | ------- | ---- | ------------------------------------- | ---------- | | `value` | `T` | The value to be contained in the Box. | *required* | Source code in `src/semlib/box.py` ```python def __init__(self, value: T) -> None: """Initialize a Box instance. Args: value: The value to be contained in the Box. """ self._value = value ``` ::: # semlib.apply ## apply ```python apply[T, U: BaseModel]( item: T, /, template: str | Callable[[T], str], *, return_type: type[U], model: str | None = None, ) -> U ``` ```python apply[T, U]( item: T, /, template: str | Callable[[T], str], *, return_type: Bare[U], model: str | None = None, ) -> U ``` ```python apply[T]( item: T, /, template: str | Callable[[T], str], *, return_type: None = None, model: str | None = None, ) -> str ``` Standalone version of apply. 
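To show the standalone form in context, here is a minimal sketch; the item, prompt template, and model name are illustrative, and `apply` is imported from the module path documented in this section.

```python
# Minimal sketch of the standalone apply helper (single-item analogue of map).
# The model name below is an example, not a library default.
import asyncio

from semlib import Bare
from semlib.apply import apply


async def main() -> None:
    year = await apply(
        "the Apollo 11 moon landing",
        template="In what year did {} happen?",
        return_type=Bare(int),
        model="openai/gpt-4o-mini",
    )
    print(year)  # 1969


asyncio.run(main())
```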
Source code in `src/semlib/apply.py` ```python async def apply[T, U: BaseModel, V]( item: T, /, template: str | Callable[[T], str], *, return_type: type[U] | Bare[V] | None = None, model: str | None = None, ) -> U | V | str: """Standalone version of [apply][semlib.apply.Apply.apply].""" applier = Apply(model=model) return await applier.apply(item, template, return_type=return_type) ``` ## apply_sync ```python apply_sync[T, U: BaseModel]( item: T, /, template: str | Callable[[T], str], *, return_type: type[U], model: str | None = None, ) -> U ``` ```python apply_sync[T, U]( item: T, /, template: str | Callable[[T], str], *, return_type: Bare[U], model: str | None = None, ) -> U ``` ```python apply_sync[T]( item: T, /, template: str | Callable[[T], str], *, return_type: None = None, model: str | None = None, ) -> str ``` Standalone synchronous version of apply. Source code in `src/semlib/apply.py` ```python def apply_sync[T, U: BaseModel, V]( item: T, /, template: str | Callable[[T], str], *, return_type: type[U] | Bare[V] | None = None, model: str | None = None, ) -> U | V | str: """Standalone synchronous version of [apply][semlib.apply.Apply.apply].""" applier = Apply(model=model) return asyncio.run(applier.apply(item, template, return_type=return_type)) # type: ignore[return-value] ``` ::: # semlib.compare ## Task Bases: `str`, `Enum` Comparison task to perform. Intended to be passed to compare and similar methods, this specifies how the LLM should compare two items. Source code in `src/semlib/compare.py` ```python class Task(str, Enum): """Comparison task to perform. Intended to be passed to [compare][semlib.compare.Compare.compare] and similar methods, this specifies how the LLM should compare two items. """ COMPARE = "compare" """Ask the model to compare two items and determine their relative order. The model must choose either `"less"` or `"greater"`. """ COMPARE_OR_ABSTAIN = "compare_or_abstain" """Ask the model to compare two items and determine their relative order, or abstain if unsure. The model must choose `"less"`, `"greater"`, or `"neither"`. """ CHOOSE_GREATER = "choose_greater" """Ask the model to choose which of the two items (a) or (b) is greater. The model must choose either `"A"` or `"B"`. """ CHOOSE_GREATER_OR_ABSTAIN = "choose_greater_or_abstain" """Ask the model to choose which of the two items (a) or (b) is greater, or abstain if unsure. The model must choose `"A"`, `"B"`, or `"neither"`. """ CHOOSE_LESSER = "choose_lesser" """Ask the model to choose which of the two items (a) or (b) is lesser. The model must choose either `"A"` or `"B"`. """ CHOOSE_LESSER_OR_ABSTAIN = "choose_lesser_or_abstain" """Ask the model to choose which of the two items (a) or (b) is lesser, or abstain if unsure. The model must choose `"A"`, `"B"`, or `"neither"`. """ ``` ### CHOOSE_GREATER ```python CHOOSE_GREATER = 'choose_greater' ``` Ask the model to choose which of the two items (a) or (b) is greater. The model must choose either `"A"` or `"B"`. ### CHOOSE_GREATER_OR_ABSTAIN ```python CHOOSE_GREATER_OR_ABSTAIN = 'choose_greater_or_abstain' ``` Ask the model to choose which of the two items (a) or (b) is greater, or abstain if unsure. The model must choose `"A"`, `"B"`, or `"neither"`. ### CHOOSE_LESSER ```python CHOOSE_LESSER = 'choose_lesser' ``` Ask the model to choose which of the two items (a) or (b) is lesser. The model must choose either `"A"` or `"B"`. 
### CHOOSE_LESSER_OR_ABSTAIN ```python CHOOSE_LESSER_OR_ABSTAIN = 'choose_lesser_or_abstain' ``` Ask the model to choose which of the two items (a) or (b) is lesser, or abstain if unsure. The model must choose `"A"`, `"B"`, or `"neither"`. ### COMPARE ```python COMPARE = 'compare' ``` Ask the model to compare two items and determine their relative order. The model must choose either `"less"` or `"greater"`. ### COMPARE_OR_ABSTAIN ```python COMPARE_OR_ABSTAIN = 'compare_or_abstain' ``` Ask the model to compare two items and determine their relative order, or abstain if unsure. The model must choose `"less"`, `"greater"`, or `"neither"`. ## Order Bases: `str`, `Enum` Result of a comparison. Source code in `src/semlib/compare.py` ```python class Order(str, Enum): """Result of a comparison.""" LESS = "less" """Item A is less than Item B.""" GREATER = "greater" """Item A is greater than Item B.""" NEITHER = "neither" """Items A and B are equivalent, or the model abstained from choosing.""" ``` ### GREATER ```python GREATER = 'greater' ``` Item A is greater than Item B. ### LESS ```python LESS = 'less' ``` Item A is less than Item B. ### NEITHER ```python NEITHER = 'neither' ``` Items A and B are equivalent, or the model abstained from choosing. ## compare ```python compare[T]( a: T, b: T, /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> Order ``` Standalone version of compare. Source code in `src/semlib/compare.py` ```python async def compare[T]( a: T, b: T, /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> Order: """Standalone version of [compare][semlib.compare.Compare.compare].""" comparator = Compare(model=model) return await comparator.compare(a, b, by=by, to_str=to_str, template=template, task=task) ``` ## compare_sync ```python compare_sync[T]( a: T, b: T, /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> Order ``` Standalone synchronous version of compare. Source code in `src/semlib/compare.py` ```python def compare_sync[T]( a: T, b: T, /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, ) -> Order: """Standalone synchronous version of [compare][semlib.compare.Compare.compare].""" comparator = Compare(model=model) return asyncio.run(comparator.compare(a, b, by=by, to_str=to_str, template=template, task=task)) ``` ::: # semlib.extrema ## min ```python min[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> T ``` Standalone version of min. 
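A minimal sketch of calling the standalone `min` follows; the items, criterion, model name, and `max_concurrency` value are illustrative.

```python
# Minimal sketch of the standalone min helper from semlib.extrema.
# The model name and max_concurrency value are example choices.
import asyncio

from semlib.extrema import min as semantic_min  # aliased to avoid shadowing the builtin


async def main() -> None:
    shortest = await semantic_min(
        ["Mount Everest", "Ben Nevis", "Mont Blanc", "Denali"],
        by="height above sea level",
        model="openai/gpt-4o-mini",
        max_concurrency=4,
    )
    print(shortest)  # expected: 'Ben Nevis'


asyncio.run(main())
```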
Source code in `src/semlib/extrema.py` ```python async def min[T]( # noqa: A001 iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> T: """Standalone version of [min][semlib.extrema.Extrema.min].""" extrema = Extrema(model=model, max_concurrency=max_concurrency) return await extrema.min(iterable, by=by, to_str=to_str, template=template, task=task) ``` ## min_sync ```python min_sync[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> T ``` Standalone synchronous version of min. Source code in `src/semlib/extrema.py` ```python def min_sync[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> T: """Standalone synchronous version of [min][semlib.extrema.Extrema.min].""" extrema = Extrema(model=model, max_concurrency=max_concurrency) return asyncio.run(extrema.min(iterable, by=by, to_str=to_str, template=template, task=task)) ``` ## max ```python max[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> T ``` Standalone version of max. Source code in `src/semlib/extrema.py` ```python async def max[T]( # noqa: A001 iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> T: """Standalone version of [max][semlib.extrema.Extrema.max].""" extrema = Extrema(model=model, max_concurrency=max_concurrency) return await extrema.max(iterable, by=by, to_str=to_str, template=template, task=task) ``` ## max_sync ```python max_sync[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> T ``` Standalone synchronous version of max. Source code in `src/semlib/extrema.py` ```python def max_sync[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T, T], str] | None = None, task: Task | str | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> T: """Standalone synchronous version of [max][semlib.extrema.Extrema.max].""" extrema = Extrema(model=model, max_concurrency=max_concurrency) return asyncio.run(extrema.max(iterable, by=by, to_str=to_str, template=template, task=task)) ``` ::: # semlib.filter ## filter ```python filter[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> list[T] ``` Standalone version of filter. 
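A minimal sketch of the standalone `filter` follows; the list, criterion, and model name are illustrative.

```python
# Minimal sketch of the standalone filter helper; the model name is an example.
import asyncio

from semlib.filter import filter as semantic_filter  # aliased to avoid shadowing the builtin


async def main() -> None:
    languages = ["Python", "Haskell", "C", "OCaml", "Go"]
    functional = await semantic_filter(
        languages,
        by="is primarily a functional programming language",
        model="openai/gpt-4o-mini",
    )
    print(functional)  # e.g. ['Haskell', 'OCaml']


asyncio.run(main())
```

Passing `negate=True` instead keeps the items that do not satisfy the criterion.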
Source code in `src/semlib/filter.py` ```python async def filter[T]( # noqa: A001 iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> list[T]: """Standalone version of [filter][semlib.filter.Filter.filter].""" filterer = Filter(model=model, max_concurrency=max_concurrency) return await filterer.filter(iterable, by=by, to_str=to_str, template=template, negate=negate) ``` ## filter_sync ```python filter_sync[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> list[T] ``` Standalone synchronous version of filter. Source code in `src/semlib/filter.py` ```python def filter_sync[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> list[T]: """Standalone synchronous version of [filter][semlib.filter.Filter.filter].""" filterer = Filter(model=model, max_concurrency=max_concurrency) return asyncio.run(filterer.filter(iterable, by=by, to_str=to_str, template=template, negate=negate)) ``` ::: # semlib.find ## find ```python find[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> T | None ``` Standalone version of find. Source code in `src/semlib/find.py` ```python async def find[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> T | None: """Standalone version of [find][semlib.find.Find.find].""" finder = Find(model=model, max_concurrency=max_concurrency) result = await finder.find(iterable, by=by, to_str=to_str, template=template, negate=negate) # binding to intermediate variable coro to avoid mypy bug, see https://github.com/python/mypy/issues/19716 and # https://github.com/python/mypy/pull/19767 (fixed now, but not shipped yet) return result # noqa: RET504 ``` ## find_sync ```python find_sync[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> T | None ``` Standalone synchronous version of find. 
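Because `find_sync` drives the event loop itself, it can be called from a plain script; a minimal sketch follows, with an illustrative list, criterion, and model name.

```python
# Minimal sketch of find_sync, the synchronous standalone find; handy in plain
# scripts that are not already running inside an event loop.
from semlib.find import find_sync

match = find_sync(
    ["Mercury", "Venus", "Mars", "Jupiter"],
    by="is a gas giant",
    model="openai/gpt-4o-mini",  # example model name
)
print(match)  # 'Jupiter', or None if no item satisfies the criterion
```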
Source code in `src/semlib/find.py` ```python def find_sync[T]( iterable: Iterable[T], /, *, by: str | None = None, to_str: Callable[[T], str] | None = None, template: str | Callable[[T], str] | None = None, negate: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> T | None: """Standalone synchronous version of [find][semlib.find.Find.find].""" finder = Find(model=model, max_concurrency=max_concurrency) coro = finder.find(iterable, by=by, to_str=to_str, template=template, negate=negate) # binding to intermediate variable coro to avoid mypy bug, see https://github.com/python/mypy/issues/19716 and # https://github.com/python/mypy/pull/19767 (fixed now, but not shipped yet) return asyncio.run(coro) ``` ::: # semlib.map ## map ```python map[T, U: BaseModel]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: type[U], model: str | None = None, max_concurrency: int | None = None, ) -> list[U] ``` ```python map[T, U]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: Bare[U], model: str | None = None, max_concurrency: int | None = None, ) -> list[U] ``` ```python map[T]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: None = None, model: str | None = None, max_concurrency: int | None = None, ) -> list[str] ``` Standalone version of map. Source code in `src/semlib/map.py` ```python async def map[T, U: BaseModel, V]( # noqa: A001 iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: type[U] | Bare[V] | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> list[U] | list[V] | list[str]: """Standalone version of [map][semlib.map.Map.map].""" mapper = Map(model=model, max_concurrency=max_concurrency) return await mapper.map(iterable, template, return_type=return_type) ``` ## map_sync ```python map_sync[T, U: BaseModel]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: type[U], model: str | None = None, max_concurrency: int | None = None, ) -> list[U] ``` ```python map_sync[T, U]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: Bare[U], model: str | None = None, max_concurrency: int | None = None, ) -> list[U] ``` ```python map_sync[T]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: None = None, model: str | None = None, max_concurrency: int | None = None, ) -> list[str] ``` Standalone synchronous version of map. Source code in `src/semlib/map.py` ```python def map_sync[T, U: BaseModel, V]( iterable: Iterable[T], /, template: str | Callable[[T], str], *, return_type: type[U] | Bare[V] | None = None, model: str | None = None, max_concurrency: int | None = None, ) -> list[U] | list[V] | list[str]: """Standalone synchronous version of [map][semlib.map.Map.map].""" mapper = Map(model=model, max_concurrency=max_concurrency) return asyncio.run(mapper.map(iterable, template, return_type=return_type)) # type: ignore[return-value] ``` ::: # semlib.prompt ## prompt ```python prompt[T: BaseModel]( prompt: str, /, *, return_type: type[T], model: str | None = None, ) -> T ``` ```python prompt[T]( prompt: str, /, *, return_type: Bare[T], model: str | None = None, ) -> T ``` ```python prompt( prompt: str, /, *, return_type: None = None, model: str | None = None, ) -> str ``` Standalone version of prompt. 
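A minimal sketch of the standalone `prompt` with a Pydantic return type follows; the `Country` model, question, and model name are illustrative.

```python
# Minimal sketch of the standalone prompt helper with structured output.
# The Country model and the model name are example choices.
import asyncio

import pydantic

from semlib.prompt import prompt


class Country(pydantic.BaseModel):
    name: str
    capital: str


async def main() -> None:
    country = await prompt(
        "Which country is home to the Eiffel Tower?",
        return_type=Country,
        model="openai/gpt-4o-mini",
    )
    print(country.capital)  # 'Paris'


asyncio.run(main())
```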
Source code in `src/semlib/prompt.py` ```python async def prompt[T: BaseModel, U]( prompt: str, /, *, return_type: type[T] | Bare[U] | None = None, model: str | None = None ) -> str | T | U: """Standalone version of [prompt][semlib._internal.base.Base.prompt].""" base = Base(model=model) return await base.prompt(prompt, return_type=return_type) ``` ## prompt_sync ```python prompt_sync[T: BaseModel]( prompt: str, /, *, return_type: type[T], model: str | None = None, ) -> T ``` ```python prompt_sync[T]( prompt: str, /, *, return_type: Bare[T], model: str | None = None, ) -> T ``` ```python prompt_sync( prompt: str, /, *, return_type: None = None, model: str | None = None, ) -> str ``` Standalone synchronous version of prompt. Source code in `src/semlib/prompt.py` ```python def prompt_sync[T: BaseModel, U]( prompt: str, /, *, return_type: type[T] | Bare[U] | None = None, model: str | None = None ) -> str | T | U: """Standalone synchronous version of [prompt][semlib._internal.base.Base.prompt].""" base = Base(model=model) return asyncio.run(base.prompt(prompt, return_type=return_type)) # type: ignore[return-value] ``` ::: # semlib.reduce ## reduce ```python reduce( iterable: Iterable[str], /, template: str | Callable[[str, str], str], *, associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> str ``` ```python reduce[T]( iterable: Iterable[str | T], /, template: str | Callable[[str | T, str | T], str], *, associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> str | T ``` ```python reduce[T: BaseModel]( iterable: Iterable[T], /, template: str | Callable[[T, T], str], *, return_type: type[T], associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> T ``` ```python reduce[T]( iterable: Iterable[T], /, template: str | Callable[[T, T], str], *, return_type: Bare[T], associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> T ``` ```python reduce[T, U: BaseModel]( iterable: Iterable[T], /, template: str | Callable[[U, T], str], initial: U, *, return_type: type[U], model: str | None = None, max_concurrency: int | None = None, ) -> U ``` ```python reduce[T, U]( iterable: Iterable[T], /, template: str | Callable[[U, T], str], initial: U, *, return_type: Bare[U], model: str | None = None, max_concurrency: int | None = None, ) -> U ``` Standalone version of reduce. 
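A minimal sketch of the standalone `reduce` using an associative tree reduction follows; the inputs, model name, and concurrency limit are illustrative.

```python
# Minimal sketch of the standalone reduce helper with an associative tree
# reduction; the model name and max_concurrency value are example choices.
import asyncio

from semlib import Bare
from semlib.reduce import reduce as semantic_reduce  # aliased for clarity


async def main() -> None:
    total = await semantic_reduce(
        ["two", "five", "eleven", "nineteen"],
        "{} + {} = ? Reply with just the sum.",
        return_type=Bare(int),
        associative=True,  # addition is associative, so a balanced tree reduce is safe
        model="openai/gpt-4o-mini",
        max_concurrency=4,
    )
    print(total)  # 37


asyncio.run(main())
```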
Source code in `src/semlib/reduce.py` ```python async def reduce( iterable: Iterable[Any], /, template: str | Callable[[Any, Any], str], initial: Any = None, *, return_type: Any = None, associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> Any: """Standalone version of [reduce][semlib.reduce.Reduce.reduce].""" reducer = Reduce(model=model, max_concurrency=max_concurrency) return await reducer.reduce( # type: ignore[call-overload] iterable, template, initial, return_type=return_type, associative=associative, ) ``` ## reduce_sync ```python reduce_sync( iterable: Iterable[str], /, template: str | Callable[[str, str], str], *, associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> str ``` ```python reduce_sync[T]( iterable: Iterable[str | T], /, template: str | Callable[[str | T, str | T], str], *, associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> str | T ``` ```python reduce_sync[T: BaseModel]( iterable: Iterable[T], /, template: str | Callable[[T, T], str], *, return_type: type[T], associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> T ``` ```python reduce_sync[T]( iterable: Iterable[T], /, template: str | Callable[[T, T], str], *, return_type: Bare[T], associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> T ``` ```python reduce_sync[T, U: BaseModel]( iterable: Iterable[T], /, template: str | Callable[[U, T], str], initial: U, *, return_type: type[U], model: str | None = None, max_concurrency: int | None = None, ) -> U ``` ```python reduce_sync[T, U]( iterable: Iterable[T], /, template: str | Callable[[U, T], str], initial: U, *, return_type: Bare[U], model: str | None = None, max_concurrency: int | None = None, ) -> U ``` Standalone synchronous version of reduce. Source code in `src/semlib/reduce.py` ```python def reduce_sync( iterable: Iterable[Any], /, template: str | Callable[[Any, Any], str], initial: Any = None, *, return_type: Any = None, associative: bool = False, model: str | None = None, max_concurrency: int | None = None, ) -> Any: """Standalone synchronous version of [reduce][semlib.reduce.Reduce.reduce].""" reducer = Reduce(model=model, max_concurrency=max_concurrency) return asyncio.run( reducer.reduce( # type: ignore[call-overload] iterable, template, initial, return_type=return_type, associative=associative, ) ) ``` ::: # semlib.sort ## Algorithm Bases: `ABC` Abstract base class for sorting algorithms. Sorting algorithms can be used with sort. Source code in `src/semlib/sort/algorithm/algorithm.py` ```python class Algorithm(ABC): """Abstract base class for sorting algorithms. Sorting algorithms can be used with [sort][semlib.sort.Sort.sort].""" def __init__(self) -> None: """Initialize.""" @abstractmethod async def _sort[T]( self, iterable: Iterable[T], /, *, reverse: bool = False, comparator: Callable[[T, T], Coroutine[None, None, Order]], max_concurrency: int, ) -> list[T]: ... ``` ### __init__ ```python __init__() -> None ``` Initialize. Source code in `src/semlib/sort/algorithm/algorithm.py` ```python def __init__(self) -> None: """Initialize.""" ``` ## QuickSort Bases: `Algorithm` Quicksort sorting algorithm. This algorithm uses the [Quicksort](https://en.wikipedia.org/wiki/Quicksort) method to sort items. 
This algorithm does **not** provide theoretical guarantees with noisy pairwise comparisons, but standard sorting algorithms can perform well in practice ([Qin et al., 2024](https://aclanthology.org/2024.findings-naacl.97/)) even with noisy pairwise comparisons using LLMs. This algorithm requires O(n log n) pairwise comparisons on average.

If you want higher-quality rankings and can tolerate increased costs and latency, you can consider using the BordaCount algorithm instead.

Source code in `src/semlib/sort/algorithm/quicksort.py`

```python
class QuickSort(Algorithm):
    """Quicksort sorting algorithm.

    This algorithm uses the [Quicksort](https://en.wikipedia.org/wiki/Quicksort) method to sort items.

    This algorithm does **not** provide theoretical guarantees with noisy pairwise comparisons, but standard
    sorting algorithms can perform well in practice
    ([Qin et al., 2024](https://aclanthology.org/2024.findings-naacl.97/)) even with noisy pairwise comparisons
    using LLMs. This algorithm requires O(n log n) pairwise comparisons on average.

    If you want higher-quality rankings and can tolerate increased costs and latency, you can consider using the
    [BordaCount][semlib.sort.algorithm.BordaCount] algorithm instead.
    """

    @override
    def __init__(self, *, randomized: bool = False):
        """Initialize.

        Args:
            randomized: If `True`, uses a randomized pivot selection strategy. This can help avoid worst-case
                O(n^2) performance on certain inputs, but results may be non-deterministic. If False, always uses
                the first item as the pivot.
        """
        super().__init__()
        self._randomized = randomized

    @override
    async def _sort[T](
        self,
        iterable: Iterable[T],
        /,
        *,
        reverse: bool = False,
        comparator: Callable[[T, T], Coroutine[None, None, Order]],
        max_concurrency: int,
    ) -> list[T]:
        lst = iterable if isinstance(iterable, list) else list(iterable)

        async def quicksort(lst: list[T]) -> list[T]:
            if len(lst) <= 1:
                return lst
            pivot_index = secrets.randbelow(len(lst)) if self._randomized else 0
            pivot = lst[pivot_index]
            less = []
            greater = []
            equal = [pivot]  # to handle "neither" case
            comparisons = await util.gather(
                *(comparator(item, pivot) for i, item in enumerate(lst) if i != pivot_index)
            )
            for i, item in enumerate(lst):
                if i == pivot_index:
                    continue
                comparison = comparisons[i if i < pivot_index else i - 1]
                if comparison == Order.LESS:
                    less.append(item)
                elif comparison == Order.GREATER:
                    greater.append(item)
                else:
                    equal.append(item)
            sort_less, sort_greater = await util.gather(quicksort(less), quicksort(greater))
            return sort_less + equal + sort_greater

        sort_list = await quicksort(lst)
        return sort_list[::-1] if reverse else sort_list
```

### __init__

```python
__init__(*, randomized: bool = False)
```

Initialize.

Parameters:

| Name | Type | Description | Default |
| ------------ | ------ | ----------- | ------- |
| `randomized` | `bool` | If True, uses a randomized pivot selection strategy. This can help avoid worst-case O(n^2) performance on certain inputs, but results may be non-deterministic. If False, always uses the first item as the pivot. | `False` |

Source code in `src/semlib/sort/algorithm/quicksort.py`

```python
@override
def __init__(self, *, randomized: bool = False):
    """Initialize.

    Args:
        randomized: If `True`, uses a randomized pivot selection strategy.
            This can help avoid worst-case O(n^2) performance on certain inputs, but results may be
            non-deterministic. If False, always uses the first item as the pivot.
    """
    super().__init__()
    self._randomized = randomized
```

## BordaCount

Bases: `Algorithm`

Borda count sorting algorithm.

This algorithm uses the [Borda count](https://en.wikipedia.org/wiki/Borda_count) method to rank items. The algorithm has good theoretical properties for finding approximate rankings based on noisy pairwise comparisons ([Shah and Wainwright, 2018](https://jmlr.org/papers/volume18/16-206/16-206.pdf)).

This algorithm requires O(n^2) pairwise comparisons. If you want to reduce the number of comparisons (to reduce LLM costs), you can consider using the QuickSort algorithm instead.

This algorithm is carefully implemented so that it has O(n) space complexity.

Source code in `src/semlib/sort/algorithm/borda_count.py`

```python
class BordaCount(Algorithm):
    """Borda count sorting algorithm.

    This algorithm uses the [Borda count](https://en.wikipedia.org/wiki/Borda_count) method to rank items. The
    algorithm has good theoretical properties for finding approximate rankings based on noisy pairwise comparisons
    ([Shah and Wainwright, 2018](https://jmlr.org/papers/volume18/16-206/16-206.pdf)).

    This algorithm requires O(n^2) pairwise comparisons. If you want to reduce the number of comparisons (to reduce
    LLM costs), you can consider using the [QuickSort][semlib.sort.algorithm.QuickSort] algorithm instead.

    This algorithm is carefully implemented so that it has O(n) space complexity.
    """

    @override
    async def _sort[T](
        self,
        iterable: Iterable[T],
        /,
        *,
        reverse: bool = False,
        comparator: Callable[[T, T], Coroutine[None, None, Order]],
        max_concurrency: int,
    ) -> list[T]:
        lst = iterable if isinstance(iterable, list) else list(iterable)
        scores: list[int] = [0 for _ in lst]

        async def fn(item: tuple[int, int]) -> None:
            i, j = item
            result_ij, result_ji = await util.gather(comparator(lst[i], lst[j]), comparator(lst[j], lst[i]))
            if result_ij == Order.LESS and result_ji == Order.GREATER:
                scores[j] += 1
                scores[i] -= 1
            elif result_ij == Order.GREATER and result_ji == Order.LESS:
                scores[i] += 1
                scores[j] -= 1

        await foreach(
            fn,
            ((i, j) for i in range(len(lst)) for j in range(i + 1, len(lst))),
            max_concurrency=max(1, max_concurrency // 2),  # because each worker does two comparisons concurrently
        )

        # stable sort
        sort_by_score = sorted([(scores[i], i, lst[i]) for i in range(len(lst))], reverse=reverse)
        return [item for _, _, item in sort_by_score]
```

### __init__

```python
__init__() -> None
```

Initialize.

Source code in `src/semlib/sort/algorithm/algorithm.py`

```python
def __init__(self) -> None:
    """Initialize."""
```

## sort_sync

```python
sort_sync[T](
    iterable: Iterable[T],
    /,
    *,
    by: str | None = None,
    to_str: Callable[[T], str] | None = None,
    template: str | Callable[[T, T], str] | None = None,
    task: Task | str | None = None,
    algorithm: Algorithm | None = None,
    reverse: bool = False,
    model: str | None = None,
    max_concurrency: int | None = None,
) -> list[T]
```

Standalone synchronous version of sort.
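
As a rough, unofficial illustration of `sort_sync`, the sketch below ranks a few strings with an explicitly chosen algorithm. The `from semlib import sort_sync` import path is an assumption (mirroring the other standalone helpers); `BordaCount` is imported from `semlib.sort.algorithm`, following the cross-references above. The data and criterion are made up.

```python
# Hedged example: synchronous sort with an explicit (more expensive, higher-quality) algorithm.
from semlib import sort_sync  # assumed import path for the standalone helper
from semlib.sort.algorithm import BordaCount

papers = [
    "Attention Is All You Need",
    "ImageNet Classification with Deep Convolutional Neural Networks",
    "Playing Atari with Deep Reinforcement Learning",
]

# BordaCount performs all O(n^2) pairwise comparisons, trading cost for ranking quality.
ranked = sort_sync(
    papers,
    by="relevance to natural language processing",
    algorithm=BordaCount(),
    reverse=True,  # most relevant first
)
print(ranked)
```
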
Source code in `src/semlib/sort/sort.py`

```python
def sort_sync[T](
    iterable: Iterable[T],
    /,
    *,
    by: str | None = None,
    to_str: Callable[[T], str] | None = None,
    template: str | Callable[[T, T], str] | None = None,
    task: Task | str | None = None,
    algorithm: Algorithm | None = None,
    reverse: bool = False,
    model: str | None = None,
    max_concurrency: int | None = None,
) -> list[T]:
    """Standalone synchronous version of [sort][semlib.sort.Sort.sort]."""
    sorter = Sort(
        model=model,
        max_concurrency=max_concurrency,
    )
    return asyncio.run(
        sorter.sort(iterable, by=by, to_str=to_str, template=template, task=task, algorithm=algorithm, reverse=reverse)
    )
```
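
To make the `Algorithm` contract above concrete, here is a hypothetical custom algorithm that is not part of Semlib: a sequential insertion sort implementing the private `_sort` hook in the same way `QuickSort` and `BordaCount` do. The import path for `Order` is a guess (the listed sources use it without showing where it is defined), so treat this as a sketch under those assumptions rather than a confirmed API.

```python
# Hypothetical example: a minimal Algorithm subclass. Import paths marked below are assumptions.
from collections.abc import Callable, Coroutine, Iterable
from typing import override

from semlib.compare import Order  # assumed location of the Order enum
from semlib.sort.algorithm import Algorithm


class InsertionSort(Algorithm):
    """Insertion sort: O(n^2) comparisons, issued one at a time (no concurrency)."""

    @override
    async def _sort[T](
        self,
        iterable: Iterable[T],
        /,
        *,
        reverse: bool = False,
        comparator: Callable[[T, T], Coroutine[None, None, Order]],
        max_concurrency: int,  # unused here: comparisons are made sequentially in this sketch
    ) -> list[T]:
        result: list[T] = []
        for item in iterable:
            # Insert before the first already-placed element that the comparator ranks above the new item.
            position = len(result)
            for i, placed in enumerate(result):
                if await comparator(item, placed) == Order.LESS:
                    position = i
                    break
            result.insert(position, item)
        return result[::-1] if reverse else result
```
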