Vector Embeddings and Search API Documentation

Enhance job searches with semantic understanding through vector embeddings.

Introduction

The jobdata API provides vector embeddings and vector search capabilities, enabling advanced job search functionalities. These features allow users to leverage pre-generated embeddings for job posts and perform semantic searches to find job listings that are contextually similar to a given query. This is particularly useful for job search platforms, HR software, and market researchers who need to match job postings based on semantic meaning rather than just keyword matching.

Vector embeddings are numerical representations of text that capture the semantic meaning of the content. By using embeddings, the API can understand the context and meaning behind job descriptions, titles, and other text fields, allowing for more accurate and relevant search results.

Limitations

This service is still experimental. We do our best to provide a production-ready, reliable experience at all times, but keep in mind that all embeddings data depends on a single third-party provider: OpenAI, whose text-embedding-3-small model generates the vector embeddings.

If their service were permanently disrupted or discontinued, we would switch to an open-source or open-weights alternative of at least similar performance and backfill all job listings with newly generated embeddings, which would remain available to our subscribed customers.

When you use the vec_text query parameter to perform vector embedding matching through our API, the text value of the parameter is sent to an OpenAI endpoint for instant embeddings generation. Although we have set Data retention to Disabled and don't store any queries ourselves by default, we strongly recommend not including any sensitive data in your queries. Use generic search phrases or anonymized job profiles instead.

Endpoint Overview

Endpoint (list): /api/jobs/

Endpoint (single element): /api/jobs/{id}/

Method: GET

Authorization Required: Yes (access pro+ or access ultra subscription)

Description: The Jobs endpoint now supports vector embeddings and vector search, allowing users to retrieve job listings with pre-generated embeddings and perform semantic searches based on a text query.

Main Features

Vector Embeddings

The API now provides pre-generated embeddings for every job post using OpenAI's text-embedding-3-small model. These embeddings are available as half-precision floats at 768 dimensions:

  • 768-dimensional embeddings in half-precision (embed_3sh): These embeddings capture a representation of the job post's title and description, suitable for detailed semantic analysis.

To activate them in the API response, you can use the following query parameter:

  • embed_3sh: Set to true to include embeddings in the response.

Note: With OpenAI's text-embedding-3-small model you can shorten embeddings (i.e. remove some numbers from the end of the sequence) without the embedding losing its concept-representing properties.
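The note above can be illustrated with a minimal sketch of shortening an embedding on the client side. The helper name `shorten_embedding` is hypothetical, and rescaling the shortened vector to unit length is an assumption based on common practice (it keeps cosine comparisons well-behaved), not something this API requires.

```python
import math

def shorten_embedding(vector, dims):
    """Keep the first `dims` values and rescale the result to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = [float(x) for x in vector[:dims]]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero input: nothing to rescale
    return [x / norm for x in head]

# Toy 4-dimensional vector standing in for a real 768-dimensional embedding.
short = shorten_embedding([3.0, 4.0, 12.0, 84.0], 2)
print(short)
```

If you shorten embeddings, shorten every vector you compare to the same length. Vectors of different lengths cannot be compared with cosine similarity.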

Vector Search

The API now supports vector search, which lets you find job listings that are semantically similar to a given text query. The query is converted into an embedding in real time and compared against the pre-generated embeddings of job posts using cosine similarity.

  • vec_text: This query parameter allows you to specify a text string (up to 1000 characters) for vector search. The API will return job listings that are most similar to the query based on their embeddings.

Note that searches are case-sensitive (even punctuation can make a difference!), meaning the casing of your query can influence the semantic meaning and the results returned. For example, a search for "Software Engineer" may yield different results compared to "software engineer" due to variations in how the model interprets the context. To ensure consistent and accurate results, consider standardizing the casing of your search queries before making API requests. This is particularly relevant when searching for job titles, technical terms, or industry-specific phrases.
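One way to standardize queries before sending them is sketched below. The helper `normalize_query` is hypothetical, and lowercasing is only one possible convention; which casing works best for your queries is something to test against your own data. The 1000-character limit is the vec_text limit stated above.

```python
def normalize_query(text, max_chars=1000):
    """Collapse whitespace, lowercase, and enforce the vec_text length limit."""
    cleaned = " ".join(text.split()).lower()
    if len(cleaned) > max_chars:
        raise ValueError(f"query is {len(cleaned)} characters; limit is {max_chars}")
    return cleaned

print(normalize_query("  Software   Engineer "))
```

Applying the same normalization to every query means that two users typing the same phrase with different casing or spacing send identical text to the API.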

Cosine Similarity Threshold

The API uses a cosine similarity threshold of 0.5 to filter out job posts that are not sufficiently similar to the query. This ensures that only relevant job listings are returned.

When using vector search (vec_text), a cosine_dist field (float) is added to each job listing in the response, indicating the cosine distance between the query and the job post. Lower values mean higher semantic similarity to your search query. Results are automatically ordered by cosine_dist in ascending order, so the most semantically similar job listings appear first.
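To make the relationship between similarity, distance, and ordering concrete, here is a small sketch. It assumes that cosine_dist follows the standard definition (1 minus cosine similarity), so a similarity threshold of 0.5 corresponds to a distance of 0.5. Whether the API treats the boundary as inclusive is not documented, and the function names are hypothetical.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)

def rank_by_distance(query_vec, jobs, max_dist=0.5):
    """Drop jobs farther than max_dist (assumed inclusive), then sort nearest first."""
    scored = [(cosine_distance(query_vec, vec), job_id) for job_id, vec in jobs.items()]
    return sorted((dist, job_id) for dist, job_id in scored if dist <= max_dist)

# Toy 2-dimensional vectors; real embeddings have 768 dimensions.
jobs = {"a": [1.0, 0.0], "b": [1.0, 1.0], "c": [0.0, 1.0]}
print(rank_by_distance([1.0, 0.0], jobs))
```

In this toy data, job "c" is orthogonal to the query (distance 1.0) and is filtered out, while "a" and "b" are returned nearest first.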

Combining Vector Search with Other Filters

You can seamlessly combine a vector search query with our existing search and filter parameters available on the /api/jobs/ endpoint. This allows you to refine your semantic search results using additional criteria such as location, job type, experience level, salary range, and more.

For example, you can perform a vector search for "remote software engineer with AI experience" while filtering by country, salary range, or job type to narrow down the results. This combination of semantic understanding and precise filtering ensures that you receive the most relevant and targeted job listings for your specific needs.
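A combined request could be assembled as in the sketch below. Only vec_text comes from this page. The filter name `has_remote` is a hypothetical placeholder chosen for illustration, so check the Jobs API endpoint documentation for the actual parameter names. The helper `build_search_url` is likewise an assumption.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://jobdataapi.com/api/jobs/"

def build_search_url(vec_text, filters=None):
    """Combine a vec_text query with any additional filter parameters."""
    params = {"vec_text": vec_text}
    params.update(filters or {})
    # quote_via=quote encodes spaces as %20, matching the curl examples.
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"

# "has_remote" is a HYPOTHETICAL filter name used only for illustration.
url = build_search_url(
    "remote software engineer with AI experience",
    {"has_remote": "true"},
)
print(url)
```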

Request

To make use of the new vector embeddings and search features, you can include the relevant query parameters in your request. Below are examples of how to make requests using curl and Python.

Using curl

Retrieve Job Listings with Embeddings

curl -X GET "https://jobdataapi.com/api/jobs/?embed_3sh=true" \
     -H "Authorization: Api-Key YOUR_API_KEY"

Perform Vector Search

curl -X GET "https://jobdataapi.com/api/jobs/?vec_text=remote%20software%20engineer%20with%20AI%20experience" \
     -H "Authorization: Api-Key YOUR_API_KEY"

Using Python

Retrieve Job Listings with Embeddings

import requests

url = "https://jobdataapi.com/api/jobs/"
params = {
    "embed_3sh": True,
}
headers = {"Authorization": "Api-Key YOUR_API_KEY"}

response = requests.get(url, headers=headers, params=params)
print(response.json())

Perform Vector Search

import requests

url = "https://jobdataapi.com/api/jobs/"
params = {
    "vec_text": "remote software engineer with AI experience"
}
headers = {"Authorization": "Api-Key YOUR_API_KEY"}

response = requests.get(url, headers=headers, params=params)
print(response.json())

Response

When vector embeddings are activated, the response includes an additional field for each job listing:

  • embedding_3sh: A 768-dimensional, half-precision float vector embedding of the job post (title + description).
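If you store these vectors yourself, half precision lets you keep each value in 2 bytes. The sketch below uses Python's struct module with the `e` (IEEE 754 half-precision) format. The helper names are hypothetical, and it assumes the values you receive are exactly representable as half-precision floats, as the field description suggests.

```python
import struct

def pack_half_precision(vector):
    """Pack floats as little-endian IEEE 754 half-precision (2 bytes each)."""
    return struct.pack(f"<{len(vector)}e", *vector)

def unpack_half_precision(blob):
    """Reverse pack_half_precision."""
    return list(struct.unpack(f"<{len(blob) // 2}e", blob))

# 768 placeholder values, all exactly representable in half precision.
vector = [0.5, -0.25, 0.125] * 256
blob = pack_half_precision(vector)
print(len(blob))  # 768 values * 2 bytes each
```

Values that are not exactly representable in half precision would be rounded when packed, so a round trip is only lossless under the assumption stated above.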

Example Response with Embeddings

{
  "count": 2,
  "next": null,
  "previous": null,
  "results": [
    {
      "id": 12345,
      "company": {
        "name": "Tech Innovations Inc.",
        "logo": "https://example.com/logo.png",
        "website_url": "https://techinnovations.com",
        "linkedin_url": "https://linkedin.com/company/tech-innovations-inc",
        ...
      },
      "title": "Senior Software Engineer",
      "location": "Remote",
      "has_remote": true,
      "published": "2024-02-01",
      "description": "We are looking for a Senior Software Engineer...",
      "experience_level": "SE",
      "application_url": "https://techinnovations.com/careers/12345",
      "salary_min": "100000",
      "salary_max": "150000",
      "salary_currency": "USD",
      "embedding_3sh": [0.123, 0.456, 0.789, ...]  // 768-dimensional half-precision embedding
    },
    // Additional job listings...
  ]
}

Example Response with Vector Search

{
  "count": 3,
  "next": null,
  "previous": null,
  "results": [
    {
      "id": 12345,
      "company": {
        "name": "Tech Innovations Inc.",
        "logo": "https://example.com/logo.png",
        "website_url": "https://techinnovations.com",
        "linkedin_url": "https://linkedin.com/company/tech-innovations-inc",
        ...
      },
      "title": "Senior Software Engineer",
      "location": "Remote",
      "has_remote": true,
      "published": "2024-02-01",
      "description": "We are looking for a Senior Software Engineer with experience in AI...",
      "experience_level": "SE",
      "application_url": "https://techinnovations.com/careers/12345",
      "salary_min": "100000",
      "salary_max": "150000",
      "salary_currency": "USD",
      "cosine_dist": 0.36391339790563
    },
    {
      "id": 67890,
      "company": {
        "name": "AI Solutions Ltd.",
        "logo": "https://example.com/logo.png",
        "website_url": "https://aisolutions.com",
        "linkedin_url": "https://linkedin.com/company/ai-solutions-ltd",
        ...
      },
      "title": "AI Engineer",
      "location": "Remote",
      "has_remote": true,
      "published": "2024-02-01",
      "description": "We are hiring an AI Engineer to work on cutting-edge machine learning projects...",
      "experience_level": "MI",
      "application_url": "https://aisolutions.com/careers/67890",
      "salary_min": "90000",
      "salary_max": "120000",
      "salary_currency": "USD",
      "cosine_dist": 0.396054643510141
    },
    // Additional job listings...
  ]
}
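The responses above share a paginated shape (count, next, previous, results). A minimal sketch for walking all pages follows. It assumes that `next` holds the URL of the following page and is null on the last one, which is typical for this response shape but not confirmed on this page. The `fetch_page` callable is a stand-in for your own HTTP call.

```python
def iter_jobs(first_page, fetch_page):
    """Yield every job, following the `next` link until it is null."""
    page = first_page
    while True:
        yield from page["results"]
        if not page.get("next"):
            break
        page = fetch_page(page["next"])

# Stand-in pages so the sketch runs without network access.
pages = {
    "page-2": {"next": None, "results": [{"title": "AI Engineer", "cosine_dist": 0.39}]},
}
first = {
    "next": "page-2",
    "results": [{"title": "Senior Software Engineer", "cosine_dist": 0.36}],
}

for job in iter_jobs(first, pages.__getitem__):
    print(f'{job["cosine_dist"]:.2f}  {job["title"]}')
```

In real use, `fetch_page` could be something like `lambda url: requests.get(url, headers=headers).json()`, reusing the headers from the request examples.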

Use Cases

Vector search allows you to find job listings that are semantically similar to your query, even if the exact keywords are not present in the job description. For example, a search for "remote software engineer with AI experience" could return job posts that mention "machine learning," "data science," or "artificial intelligence," even if the exact wording of the query does not appear in them.

Improved Matching Algorithms

HR and talent platforms can leverage vector search to find candidates whose profiles closely match the requirements of job postings. This can significantly enhance the recruitment process by identifying suitable candidates more efficiently.

Market Analysis

Researchers can use vector embeddings to analyze job market trends by clustering job postings based on their semantic content. This can provide insights into emerging skills, industries, and job roles.
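As a rough illustration of clustering by semantic content, the sketch below groups embeddings with a simple greedy threshold rule. This is not k-means or any method used by the API. It is a deliberately small stand-in, and the function names, the 0.8 threshold, and the toy vectors are all assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def group_by_similarity(embeddings, min_similarity=0.8):
    """Greedy grouping: join the first group whose seed vector is similar enough."""
    groups = []  # list of (seed_vector, [job_ids])
    for job_id, vec in embeddings.items():
        for seed, members in groups:
            if cosine_similarity(seed, vec) >= min_similarity:
                members.append(job_id)
                break
        else:
            # No existing group is close enough: start a new one.
            groups.append((vec, [job_id]))
    return [members for _, members in groups]

# Toy 2-dimensional vectors; real embeddings have 768 dimensions.
toy = {
    "ml-engineer": [0.9, 0.1],
    "data-scientist": [0.8, 0.3],
    "nurse": [0.1, 0.9],
}
print(group_by_similarity(toy))
```

The result depends on input order and on the threshold, so for serious analysis a dedicated clustering library is likely a better fit.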

Personalized Job Recommendations

Job search platforms can use vector embeddings to provide personalized job recommendations to users based on their search history, preferences, and profile information. This can improve user engagement and satisfaction.
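One simple way to turn a user's history into recommendations is to average the embeddings of jobs they viewed and rank candidates by cosine similarity to that average. The sketch below shows this idea under stated assumptions: the function names and toy vectors are hypothetical, and averaging is only one of many possible profile-building strategies.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mean_vector(vectors):
    """Element-wise average of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def recommend(viewed, candidates, top_n=2):
    """Rank candidate job ids by similarity to the average viewed embedding."""
    profile = mean_vector(viewed)
    ranked = sorted(
        candidates,
        key=lambda job_id: cosine_similarity(profile, candidates[job_id]),
        reverse=True,
    )
    return ranked[:top_n]

# Toy 2-dimensional vectors; real embeddings have 768 dimensions.
viewed = [[1.0, 0.0], [0.8, 0.2]]
candidates = {"backend": [0.9, 0.1], "frontend": [0.5, 0.5], "chef": [0.0, 1.0]}
print(recommend(viewed, candidates))
```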

Notes

  • Access Requirements: Vector embeddings and vector search features are available only with access pro+ and access ultra subscriptions.
  • Rate Limits: The API currently allows a limited set of generative vector search queries per day depending on the API access plan you subscribed to.
  • Embeddings Processing Time: Generating embeddings for new job posts may take up to a few hours to process and become available. If you enable the embed_3sh parameter or use vec_text, job listings that have not yet completed the embedding generation process will be excluded from the results. This ensures that only jobs with fully processed embeddings are returned. If you encounter missing jobs, try your request again after a short delay to allow the embeddings to be generated.
  • Vector Search vs. Full-Text Search: While vector search provides powerful semantic matching capabilities, it is not a replacement for full-text or other search and filtering methods. Full-text search remains essential for precise keyword-based queries, such as searching for exact phrases or specific terms. Vector search excels at understanding context and meaning, making it ideal for broader, concept-based queries, but it may not always capture exact matches or highly specific criteria.
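The retry advice in the Embeddings Processing Time note could be sketched as follows. The helper `fetch_until_present`, its defaults, and the `fetch_ids` callable (a stand-in for an API request that returns the job ids currently in the results) are all assumptions. Since embeddings may take up to a few hours and vector search queries are rate-limited, choose the delay and attempt count conservatively.

```python
import time

def fetch_until_present(fetch_ids, job_id, attempts=3, delay_seconds=60.0, sleep=time.sleep):
    """Call fetch_ids() until job_id appears or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if job_id in fetch_ids():
            return True
        if attempt < attempts:
            sleep(delay_seconds)  # wait before polling again
    return False

# Stand-in for an API call: the job shows up on the third poll.
responses = iter([{1, 2}, {1, 2}, {1, 2, 3}])
found = fetch_until_present(lambda: next(responses), job_id=3, sleep=lambda seconds: None)
print(found)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. In production you would leave the default `time.sleep` in place.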

Conclusion

By leveraging these features, you can perform more accurate and context-aware searches, match job postings with candidate profiles, and gain deeper insights into job market trends. Whether you're building a job search platform, conducting market research, or developing HR solutions, these capabilities will enhance your ability to deliver relevant and meaningful results.

With pre-generated embeddings and built-in vector search, there’s no need for subscribed customers to roll their own infrastructure to generate embeddings for every job post. We handle all the complexity, allowing you to focus on delivering value to your users.

Related Docs

Multi-value Parameters Documentation
Job Countries API Endpoint Documentation
Jobs API Endpoint Documentation
Full-Text Search on Job Descriptions
Job Cities API Endpoint Documentation
Job Types API Endpoint Documentation
CSV File Downloads Documentation
Job States API Endpoint Documentation
Job Regions API Endpoint Documentation
Jobs Expired API Endpoint Documentation