Vector Embeddings and Search API Documentation
Enhance job searches with semantic understanding through vector embeddings.
Introduction
The jobdata API provides vector embeddings and vector search capabilities, enabling advanced job search functionalities. These features allow users to leverage pre-generated embeddings for job posts and perform semantic searches to find job listings that are contextually similar to a given query. This is particularly useful for job search platforms, HR software, and market researchers who need to match job postings based on semantic meaning rather than just keyword matching.
Vector embeddings are numerical representations of text that capture the semantic meaning of the content. By using embeddings, the API can understand the context and meaning behind job descriptions, titles, and other text fields, allowing for more accurate and relevant search results.
Limitations
This service is still experimental. We do our best to provide a production-ready, reliable experience at all times, but keep in mind that all embeddings data depends on a single third-party provider: OpenAI, whose text-embedding-3-small model generates the vector embeddings.
If that service were permanently disrupted or discontinued, we would switch to an open-source or open-weights alternative of at least similar performance and backfill all job listings with newly generated embeddings, which would be available to our subscribed customers.
Privacy (Vector Search)
When you use the vec_text query parameter to perform vector embedding matching through our API, the parameter's text value is sent to an OpenAI endpoint for on-the-fly embeddings generation. Although we have set Data retention to Disabled and do not store any queries ourselves by default, we strongly recommend that you do not include any sensitive data in your queries and instead use generic search phrases or anonymized job profiles.
Endpoint Overview
Endpoint (list): /api/jobs/
Endpoint (single element): /api/jobs/{id}/
Method: GET
Authorization Required: Yes (access pro+ or access ultra subscription)
Description: The Jobs endpoint now supports vector embeddings and vector search, allowing users to retrieve job listings with pre-generated embeddings and perform semantic searches based on a text query.
Main Features
Vector Embeddings
The API now provides pre-generated embeddings for every job post using OpenAI's text-embedding-3-small model. These embeddings are available as half-precision floats at 768 dimensions:
- 768-dimensional embeddings in half-precision (embed_3sh): These embeddings capture a representation of the job post's title and description, suitable for detailed semantic analysis.
To activate them in the API response, you can use the following query parameter:
- embed_3sh: Set to true to include embeddings in the response.
Note: With OpenAI's text-embedding-3-small model you can shorten embeddings (i.e. remove some numbers from the end of the sequence) without the embedding losing its concept-representing properties.
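The following is a minimal sketch of shortening an embedding on the client side. It assumes you re-normalize the truncated vector to unit length, which OpenAI's guidance recommends when embeddings are shortened manually. The function name and the tiny sample vector are illustrative and not part of the jobdata API.

```python
import math


def shorten_embedding(vector, dims):
    """Keep the first `dims` values and rescale the result to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # an all-zero vector cannot be normalized
    return [x / norm for x in head]


# Hypothetical 4-dimensional vector standing in for a 768-dimensional
# embedding returned by the API.
full = [0.6, 0.8, 0.1, -0.2]
short = shorten_embedding(full, 2)
print(short)
```

Shorter vectors take less storage and compare faster, at the cost of some retrieval quality, so the target dimension is worth validating against your own data.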
Vector Search
The API now supports vector search, which allows you to find job listings that are semantically similar to a given text query. This is done by converting the query into embeddings in real-time and comparing it against the pre-generated embeddings of job posts using cosine similarity.
- vec_text: This query parameter allows you to specify a text string (up to 1000 characters) for vector search. The API will return job listings that are most similar to the query based on their embeddings.
Note that searches are case-sensitive (even punctuation can make a difference!), meaning the casing of your query can influence the semantic meaning and the results returned. For example, a search for "Software Engineer" may yield different results compared to "software engineer" due to variations in how the model interprets the context. To ensure consistent and accurate results, consider standardizing the casing of your search queries before making API requests. This is particularly relevant when searching for job titles, technical terms, or industry-specific phrases.
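One way to standardize queries before sending them is sketched below. It assumes lowercasing and whitespace collapsing suit your use case; both are illustrative choices, and the helper name is hypothetical. The 1000-character cap mirrors the documented vec_text limit.

```python
MAX_VEC_TEXT_CHARS = 1000  # documented upper bound for vec_text


def normalize_query(text):
    """Lowercase, collapse whitespace, and cap the length of a vec_text query."""
    cleaned = " ".join(text.split()).lower()
    return cleaned[:MAX_VEC_TEXT_CHARS].rstrip()


print(normalize_query("  Senior   Software Engineer\n(Remote) "))
# prints: senior software engineer (remote)
```

Applying the same normalization to every query keeps results comparable across requests, whichever casing convention you settle on.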
Cosine Similarity Threshold
The API uses a cosine similarity threshold of 0.5 to filter out job posts that are not sufficiently similar to the query. This ensures that only relevant job listings are returned.
When using vector search (vec_text), a cosine_dist field (float) is added to each job listing in the response, indicating the cosine distance between the query and the job post. This value helps you gauge the relevance of each result, with lower values representing higher semantic similarity to your search query. The results are automatically ordered by cosine_dist in ascending order, so the most semantically similar job listings appear first, followed by those with progressively lower similarity.
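To make the relationship between similarity and distance concrete, here is a small sketch. It assumes cosine_dist follows the common definition of 1 minus cosine similarity; the documentation above does not spell out the exact formula, so treat this as an illustration rather than the API's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    # Assumption: distance is defined as 1 - similarity.
    return 1.0 - cosine_similarity(a, b)


query = [1.0, 0.0]
print(cosine_distance(query, [2.0, 0.0]))  # same direction -> 0.0
print(cosine_distance(query, [0.0, 3.0]))  # orthogonal -> 1.0
```

Under that assumption, a similarity threshold of 0.5 would correspond to a maximum cosine_dist of 0.5.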
Combining Vector Search with Other Filters
You can seamlessly combine a vector search query with our existing search and filter parameters available on the /api/jobs/ endpoint. This allows you to refine your semantic search results using additional criteria such as location, job type, experience level, salary range, and more.
For example, you can perform a vector search for "remote software engineer with AI experience" while filtering by country, salary range, or job type to narrow down the results. This combination of semantic understanding and precise filtering ensures that you receive the most relevant and targeted job listings for your specific needs.
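A combined request could be assembled as in the sketch below. The filter names has_remote and experience_level are hypothetical, borrowed from fields in the example responses; check the /api/jobs/ parameter reference for the filters that actually exist. The helper only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

BASE_URL = "https://jobdataapi.com/api/jobs/"


def build_search_url(vec_text, **filters):
    """Combine a vec_text query with additional filter parameters."""
    params = {"vec_text": vec_text}
    # Skip filters left as None so they are not sent at all.
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}?{urlencode(params)}"


# Hypothetical filter names, used for illustration only.
url = build_search_url(
    "remote software engineer with AI experience",
    has_remote="true",
    experience_level="SE",
)
print(url)
```

urlencode encodes spaces as +, which is equivalent to %20 inside a query string.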
Request
To make use of the new vector embeddings and search features, you can include the relevant query parameters in your request. Below are examples of how to make requests using curl and Python.
Using curl
Retrieve Job Listings with Embeddings
curl -X GET "https://jobdataapi.com/api/jobs/?embed_3sh=true" \
-H "Authorization: Api-Key YOUR_API_KEY"
Perform Vector Search
curl -X GET "https://jobdataapi.com/api/jobs/?vec_text=remote%20software%20engineer%20with%20AI%20experience" \
-H "Authorization: Api-Key YOUR_API_KEY"
Using Python
Retrieve Job Listings with Embeddings
import requests
url = "https://jobdataapi.com/api/jobs/"
params = {
    "embed_3sh": "true",  # sent as the literal string "true", matching the curl example
}
headers = {"Authorization": "Api-Key YOUR_API_KEY"}
response = requests.get(url, headers=headers, params=params)
print(response.json())
Perform Vector Search
import requests
url = "https://jobdataapi.com/api/jobs/"
params = {
    "vec_text": "remote software engineer with AI experience",
}
headers = {"Authorization": "Api-Key YOUR_API_KEY"}
response = requests.get(url, headers=headers, params=params)
print(response.json())
Response
When vector embeddings are activated, the response will include an additional field for each job listing:
- embedding_3sh: A 768-dimensional vector embedding of the job post (title + description), stored as half-precision floats.
Example Response with Embeddings
{
"count": 2,
"next": null,
"previous": null,
"results": [
{
"id": 12345,
"company": {
"name": "Tech Innovations Inc.",
"logo": "https://example.com/logo.png",
"website_url": "https://techinnovations.com",
"linkedin_url": "https://linkedin.com/company/tech-innovations-inc",
...
},
"title": "Senior Software Engineer",
"location": "Remote",
"has_remote": true,
"published": "2024-02-01",
"description": "We are looking for a Senior Software Engineer...",
"experience_level": "SE",
"application_url": "https://techinnovations.com/careers/12345",
"salary_min": "100000",
"salary_max": "150000",
"salary_currency": "USD",
"embedding_3sh": [0.123, 0.456, 0.789, ...] // 768-dimensional half-precision embedding
},
// Additional job listings...
]
}
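Because the embeddings are served as half-precision floats, each value fits in two bytes. As a rough sketch, assuming you want to cache vectors locally in binary form, Python's struct module can pack and unpack them with the e format code. The helper names are illustrative, and the three-value list stands in for a real 768-dimensional embedding.

```python
import struct


def pack_half_floats(values):
    """Pack floats into little-endian IEEE 754 half-precision bytes."""
    return struct.pack(f"<{len(values)}e", *values)


def unpack_half_floats(blob):
    """Inverse of pack_half_floats: two bytes per value."""
    count = len(blob) // 2
    return list(struct.unpack(f"<{count}e", blob))


embedding = [0.5, -0.25, 0.125]  # stand-in for an embedding_3sh value
blob = pack_half_floats(embedding)
print(len(blob))                 # 6 bytes for three values
print(unpack_half_floats(blob))  # [0.5, -0.25, 0.125]
```

At two bytes per value, a full 768-dimensional vector occupies 1,536 bytes.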
Example Response for Vector Search
{
"count": 3,
"next": null,
"previous": null,
"results": [
{
"id": 12345,
"company": {
"name": "Tech Innovations Inc.",
"logo": "https://example.com/logo.png",
"website_url": "https://techinnovations.com",
"linkedin_url": "https://linkedin.com/company/tech-innovations-inc",
...
},
"title": "Senior Software Engineer",
"location": "Remote",
"has_remote": true,
"published": "2024-02-01",
"description": "We are looking for a Senior Software Engineer with experience in AI...",
"experience_level": "SE",
"application_url": "https://techinnovations.com/careers/12345",
"salary_min": "100000",
"salary_max": "150000",
"salary_currency": "USD",
"cosine_dist": 0.36391339790563
},
{
"id": 67890,
"company": {
"name": "AI Solutions Ltd.",
"logo": "https://example.com/logo.png",
"website_url": "https://aisolutions.com",
"linkedin_url": "https://linkedin.com/company/ai-solutions-ltd",
...
},
"title": "AI Engineer",
"location": "Remote",
"has_remote": true,
"published": "2024-02-01",
"description": "We are hiring an AI Engineer to work on cutting-edge machine learning projects...",
"experience_level": "MI",
"application_url": "https://aisolutions.com/careers/67890",
"salary_min": "90000",
"salary_max": "120000",
"salary_currency": "USD",
"cosine_dist": 0.396054643510141
},
// Additional job listings...
]
}
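If the built-in threshold is too permissive for your use case, you can apply a stricter cutoff on the client. The sketch below works on an already-parsed response body; the helper name and the 0.38 cutoff are arbitrary choices for illustration.

```python
def filter_by_distance(payload, max_dist):
    """Return (title, cosine_dist) pairs at or below max_dist, closest first."""
    hits = [
        (job["title"], job["cosine_dist"])
        for job in payload.get("results", [])
        if job.get("cosine_dist") is not None and job["cosine_dist"] <= max_dist
    ]
    return sorted(hits, key=lambda hit: hit[1])


# Trimmed-down version of the example response above.
payload = {
    "results": [
        {"title": "Senior Software Engineer", "cosine_dist": 0.36391339790563},
        {"title": "AI Engineer", "cosine_dist": 0.396054643510141},
    ]
}
print(filter_by_distance(payload, 0.38))
# prints: [('Senior Software Engineer', 0.36391339790563)]
```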
Use Cases
Semantic Job Search
Vector search allows you to find job listings that are semantically similar to your query, even if the exact keywords are not present in the job description. For example, a search for "remote software engineer with AI experience" could return job posts that mention "machine learning," "data science," or "artificial intelligence," even if the exact words of the query never appear in them.
Improved Matching Algorithms
HR and talent platforms can leverage vector search to find candidates whose profiles closely match the requirements of job postings. This can significantly enhance the recruitment process by identifying suitable candidates more efficiently.
Market Analysis
Researchers can use vector embeddings to analyze job market trends by clustering job postings based on their semantic content. This can provide insights into emerging skills, industries, and job roles.
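As one possible starting point for such an analysis, the sketch below groups embeddings with a small spherical k-means, which is k-means driven by cosine similarity. It is a simplified illustration in pure Python with a naive initialization; a real analysis would more likely use an established clustering library. All names are hypothetical.

```python
import math


def _unit(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cluster_embeddings(vectors, k, iterations=10):
    """Spherical k-means: returns one cluster index per input vector."""
    if not 0 < k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    points = [_unit(v) for v in vectors]
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to the centroid with the highest cosine similarity.
        labels = [max(range(k), key=lambda c: _dot(p, centroids[c])) for p in points]
        # Move each centroid to the normalized mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = _unit(mean)
    return labels


# Toy 2-dimensional vectors standing in for job post embeddings.
toy = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
print(cluster_embeddings(toy, k=2))  # [0, 1, 0, 1]
```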
Personalized Job Recommendations
Job search platforms can use vector embeddings to provide personalized job recommendations to users based on their search history, preferences, and profile information. This can improve user engagement and satisfaction.
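One simple way to sketch this is to average the embeddings of jobs a user has viewed into a profile vector and rank candidate jobs by cosine similarity to it. This is only one of many possible recommendation strategies; the function names and toy vectors below are illustrative.

```python
import math


def mean_vector(vectors):
    """Element-wise average of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def recommend(viewed, candidates, top_n=3):
    """Rank candidate job ids by similarity to the mean of viewed embeddings.

    `candidates` maps a job id to its embedding.
    """
    profile = mean_vector(viewed)
    ranked = sorted(
        candidates,
        key=lambda job_id: cosine_similarity(profile, candidates[job_id]),
        reverse=True,
    )
    return ranked[:top_n]


# Toy 2-dimensional vectors standing in for job post embeddings.
viewed = [[1.0, 0.0], [0.8, 0.2]]
candidates = {101: [1.0, 0.1], 102: [0.0, 1.0], 103: [0.5, 0.5]}
print(recommend(viewed, candidates, top_n=2))  # [101, 103]
```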
Notes
- Access Requirements: Vector embeddings and vector search features are available only with access pro+ and access ultra subscriptions.
- Rate Limits: The API currently allows a limited set of generative vector search queries per day depending on the API access plan you subscribed to.
- Embeddings Processing Time: Generating embeddings for new job posts may take up to a few hours to process and become available. If you enable the embed_3sh parameter or use vec_text, job listings that have not yet completed the embedding generation process will be excluded from the results. This ensures that only jobs with fully processed embeddings are returned. If you encounter missing jobs, try your request again after a short delay to allow the embeddings to be generated.
- Vector Search vs. Full-Text Search: While vector search provides powerful semantic matching capabilities, it is not a replacement for full-text or other search and filtering methods. Full-text search remains essential for precise keyword-based queries, such as searching for exact phrases or specific terms. Vector search excels at understanding context and meaning, making it ideal for broader, concept-based queries, but it may not always capture exact matches or highly specific criteria.
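The retry advice in the Embeddings Processing Time note could be sketched as below. The helper takes any callable that returns the job ids from a request, so it stays independent of how you call the API. The attempt count and delay are placeholder values, and every name here is hypothetical.

```python
import time


def fetch_until_present(fetch_ids, wanted_id, attempts=3, delay_seconds=60.0,
                        sleep=time.sleep):
    """Call fetch_ids() until wanted_id shows up or attempts run out."""
    for attempt in range(attempts):
        if wanted_id in fetch_ids():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next try
    return False


# Simulated responses: the job with id 3 appears on the second request.
responses = iter([[1, 2], [1, 2, 3]])
found = fetch_until_present(
    lambda: next(responses), 3, delay_seconds=0, sleep=lambda seconds: None
)
print(found)  # True
```

Since vector search queries count against a daily limit, keep the number of attempts low and the delay generous when retrying vec_text requests.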
Conclusion
By leveraging these features, you can perform more accurate and context-aware searches, match job postings with candidate profiles, and gain deeper insights into job market trends. Whether you're building a job search platform, conducting market research, or developing HR solutions, these capabilities will enhance your ability to deliver relevant and meaningful results.
With pre-generated embeddings and built-in vector search, there’s no need for subscribed customers to roll their own infrastructure to generate embeddings for every job post. We handle all the complexity, allowing you to focus on delivering value to your users.