jobdata

CSV File Downloads Documentation

Full job data download information and sample files.

Introduction

The jobdata API access ultra package provides a comprehensive solution for accessing and managing job data. With both API and CSV download options, you can integrate up-to-date job and company information into your projects efficiently. Using the data download feature instead of the API offers several key advantages, particularly for users who need complete datasets for analysis, offline access, or bulk data processing.

The CSV file downloads provide a full snapshot of all job data updated weekly, ensuring that you have access to the most current information without the need for continuous API calls. This method is more efficient for large-scale data handling, allowing for quicker integration into local databases and systems.

Additionally, data downloads are particularly beneficial for data scientists and analysts who require complete datasets for advanced analytics and reporting, offering a more streamlined and cost-effective solution compared to the incremental data retrieval typically done via API calls.

Overview

The jobdata API access ultra package provides extensive data access and download capabilities, including:

  • Unlimited access to the jobdata API feed with comprehensive job post backfill and related full company info.
  • Expired jobs info.
  • Job description text search.
  • Advanced company data search.
  • Weekly-updated job data CSV file downloads.

A fresh export of the latest data happens every Sunday around noon UTC.
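
If you automate downloads, it can help to know when the latest export should have been produced. The following is a minimal sketch, assuming the export lands at exactly 12:00 UTC on Sunday (the documentation only says "around noon", so you may want to add a safety margin). The function name last_export_time is hypothetical and not part of any jobdata tooling.

```python
from datetime import datetime, timedelta, timezone


def last_export_time(now: datetime) -> datetime:
    """Return the most recent Sunday 12:00 UTC at or before `now`.

    `now` is expected to be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    # Monday is 0 and Sunday is 6, so this is the number of days since Sunday.
    days_since_sunday = (now.weekday() + 1) % 7
    candidate = (now - timedelta(days=days_since_sunday)).replace(
        hour=12, minute=0, second=0, microsecond=0
    )
    # On a Sunday before noon, the latest export is from the previous week.
    if candidate > now:
        candidate -= timedelta(days=7)
    return candidate


print(last_export_time(datetime.now(timezone.utc)))
```

A scheduled job could compare this timestamp with the modification time of its local copy to decide whether a new download is needed.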

Data Download URLs

Data download URLs are generated with unique download access keys to maintain privacy and allow programmatic access. Every file mirrors its API endpoint counterpart and contains the same attributes that are available through the access pro subscription package (except for the description_string and location_string values; drop us a quick email if you need these included as well).

In addition, to preserve all object relationships, the following tables reflect the many-to-many relationships between the job table and its counterparts:

  • job_job_types.csv: Contains all applicable job types for every job.
  • job_job_cities.csv: Contains all associated city selections for every job.
  • job_job_states.csv: Contains all state/canton/administrative region data (referenced by city).
  • job_job_countries.csv: Contains all associated country selections for every job.
  • job_job_regions.csv: Contains all associated region selections for every job.
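
To illustrate how a relation table ties jobs to their counterparts, here is a minimal sketch that resolves job types for each job. It runs on tiny in-memory stand-ins for the real files, and the column names (id, title, name, job_id, jobtype_id) are assumptions for illustration only; check the CSV headers or the SQLite schema for the actual names.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-ins for jobs.csv, job_types.csv and job_job_types.csv.
# All column names here are assumed, not taken from the real export.
JOBS_CSV = "id,title\n1,Data Engineer\n2,Designer\n"
JOB_TYPES_CSV = "id,name\n10,Full Time\n20,Contract\n"
JOB_JOB_TYPES_CSV = "job_id,jobtype_id\n1,10\n1,20\n2,10\n"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def types_by_job(jobs, job_types, relations):
    """Map each job title to the names of all job types linked to it."""
    type_names = {row["id"]: row["name"] for row in job_types}
    linked = defaultdict(list)
    for rel in relations:
        # Each relation row connects one job to one job type.
        linked[rel["job_id"]].append(type_names[rel["jobtype_id"]])
    return {job["title"]: linked.get(job["id"], []) for job in jobs}


result = types_by_job(
    read_rows(JOBS_CSV), read_rows(JOB_TYPES_CSV), read_rows(JOB_JOB_TYPES_CSV)
)
print(result)
```

The same two-step lookup applies to the city, country, and region relation tables.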

Here are the formats of all available download links:

  • Industries: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/industries.csv.gz
  • Company types: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/company_types.csv.gz
  • Companies: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/companies.csv.gz
  • Job types: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/job_types.csv.gz
  • Job regions: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/job_regions.csv.gz
  • Job countries: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/job_countries.csv.gz
  • Job states: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/job_states.csv.gz
  • Job cities: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/job_cities.csv.gz
  • Jobs: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/jobs.csv.gz
  • Job type relations: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/job_job_types.csv.gz
  • Job city relations: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/job_job_cities.csv.gz
  • Job country relations: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/job_job_countries.csv.gz
  • Job region relations: https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/job_job_regions.csv.gz

These links are designed for easy integration into automated systems, ensuring that your data remains up-to-date with minimal effort.

You get access to them right after subscribing to the access ultra package by clicking "Generate new links" in your dashboard. Note that the download access key is always different from your API access key and can only be used as part of your download link URLs. Generating a new download access key invalidates the previous key and all download URLs associated with it.
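
As a rough sketch of programmatic access, the snippet below builds the download URLs from a key and saves one compressed export to disk using only the Python standard library. The helper names (download_url, fetch_export) and the EXPORT_NAMES list are hypothetical; the URL pattern and file names are taken from the link list above.

```python
import shutil
import urllib.request
from pathlib import Path

BASE_URL = "https://jobdataapi.com/download"

# File names as they appear in the download link list.
EXPORT_NAMES = [
    "industries", "company_types", "companies", "job_types",
    "job_regions", "job_countries", "job_states", "job_cities", "jobs",
    "job_job_types", "job_job_cities", "job_job_countries", "job_job_regions",
]


def download_url(access_key: str, name: str) -> str:
    """Build the download URL for one export file."""
    return f"{BASE_URL}/{access_key}/{name}.csv.gz"


def fetch_export(access_key: str, name: str, target_dir: str = ".") -> Path:
    """Download one export and store it as <name>.csv.gz in target_dir."""
    target = Path(target_dir) / f"{name}.csv.gz"
    with urllib.request.urlopen(download_url(access_key, name)) as response:
        with open(target, "wb") as out:
            # Stream the response to disk instead of loading it into memory.
            shutil.copyfileobj(response, out)
    return target


# Example usage (requires a valid download access key):
# for name in EXPORT_NAMES:
#     fetch_export("<DOWNLOAD_ACCESS_KEY>", name)
```

Because a regenerated key invalidates old URLs, it is safer to read the key from configuration at run time than to hard-code full URLs.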

Sample Files and SQLite Database Demo

To help you get started, we provide sample files containing recent information on 100 companies and their 5,000 most recent jobs. Additionally, an example SQLite database demonstrates how the data can be reassembled.

Sample Files

The following CSV files have exactly the same format as the full data files that subscribers can download from the dashboard under "Download access":

For more details on attributes and field types you can also refer to the corresponding API endpoint documentation.

SQLite Demo Database

A full SQLite3 database filled with the sample data from the CSV files above can be downloaded here: demo.sqlite3

SQLite Database Schema

The example schema for the SQLite database can be found here: sqlite_demo_schema.sql
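
A quick way to get oriented in the demo database is to list its tables and row counts. The sketch below makes no assumptions about the actual table names. It is demonstrated on a small in-memory database with a made-up job table; to inspect the real file, connect to demo.sqlite3 instead.

```python
import sqlite3


def table_row_counts(connection):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    return {
        name: connection.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


# In-memory stand-in with a hypothetical table; for the demo file use
# sqlite3.connect("demo.sqlite3") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO job (title) VALUES (?)", [("Data Engineer",), ("Designer",)]
)
print(table_row_counts(conn))
```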

Examples

Here are some simple use cases for the data download feature, along with corresponding code examples:

Use Case 1: Market Analysis

Objective: Analyze job market trends to identify high-demand skills and emerging industries.

Steps:

1. Download and load data:

curl -o jobs.csv.gz https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/jobs.csv.gz
curl -o companies.csv.gz https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/companies.csv.gz
gunzip jobs.csv.gz
gunzip companies.csv.gz

2. Analyze job postings:

import pandas as pd

jobs = pd.read_csv('jobs.csv')
companies = pd.read_csv('companies.csv')

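# Merge jobs with companies to get complete job data.
# Job columns keep their names; overlapping company columns get a suffix.
job_data = pd.merge(jobs, companies, left_on='company_id', right_on='id',
                    suffixes=('', '_company'))

# Identify high-demand skills by counting occurrences in job descriptions.
# na=False treats missing descriptions as non-matches; regex=False matches
# the skill name literally.
skills = ['Python', 'JavaScript', 'SQL', 'Java', 'AWS']
skill_counts = {
    skill: int(job_data['description'].str.contains(skill, case=False, na=False, regex=False).sum())
    for skill in skills
}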

print(skill_counts)

3. Visualize results:

import matplotlib.pyplot as plt

plt.bar(skill_counts.keys(), skill_counts.values())
plt.xlabel('Skills')
plt.ylabel('Demand Count')
plt.title('High-Demand Skills in Job Postings')
plt.show()
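
Note that plain substring matching counts "Java" inside "JavaScript", which inflates the Java figure. A possible refinement, sketched here with the standard re module on a handful of made-up descriptions, is to match each skill only as a whole term. The function name count_skill_mentions is hypothetical.

```python
import re


def count_skill_mentions(descriptions, skills):
    """Count descriptions that mention each skill as a whole term, ignoring case."""
    counts = {}
    for skill in skills:
        # (?<!\w) and (?!\w) act as word boundaries that also work for names
        # ending in symbols; re.escape keeps such characters literal.
        pattern = re.compile(rf"(?<!\w){re.escape(skill)}(?!\w)", re.IGNORECASE)
        counts[skill] = sum(
            1
            for text in descriptions
            if isinstance(text, str) and pattern.search(text)
        )
    return counts


# Made-up sample descriptions; None stands in for a missing value.
sample = [
    "Senior JavaScript developer",
    "Java and SQL backend engineer",
    None,
    "We use C++ and java daily",
]
print(count_skill_mentions(sample, ["Java", "JavaScript", "SQL", "C++"]))
```

With a DataFrame, the same function can be applied to a description column by passing the column as the descriptions argument.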

Use Case 2: Competitive Analysis

Objective: Analyze competitors' hiring patterns and job distribution.

Steps:

1. Download and load data:

curl -o jobs.csv.gz https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/jobs.csv.gz
curl -o companies.csv.gz https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/companies.csv.gz
gunzip jobs.csv.gz
gunzip companies.csv.gz

2. Analyze competitor hiring patterns:

import pandas as pd

jobs = pd.read_csv('jobs.csv')
companies = pd.read_csv('companies.csv')

# Filter for competitors' job postings (assuming competitor IDs are known)
competitor_ids = [101, 102, 103]  # Example competitor IDs
competitor_jobs = jobs[jobs['company_id'].isin(competitor_ids)]

# Count job postings by competitor
competitor_counts = competitor_jobs['company_id'].value_counts()

# Map company names for better readability
company_names = companies.set_index('id')['name'].to_dict()
competitor_counts.index = competitor_counts.index.map(company_names)

print(competitor_counts)

3. Visualize results:

import matplotlib.pyplot as plt

competitor_counts.plot(kind='bar')
plt.xlabel('Competitor')
plt.ylabel('Number of Job Postings')
plt.title('Competitor Hiring Patterns')
plt.show()

Use Case 3: Salary Analysis

Objective: Analyze salary distributions across different job titles and industries.

Steps:

1. Download and load data:

curl -o jobs.csv.gz https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/jobs.csv.gz
gunzip jobs.csv.gz

2. Analyze salary data:

import pandas as pd

jobs = pd.read_csv('jobs.csv')

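# Filter out jobs with missing salary information.
# .copy() avoids pandas' SettingWithCopyWarning when adding a column below.
salary_data = jobs.dropna(subset=['salary_min', 'salary_max']).copy()

# Calculate the average salary as the midpoint of the salary range
salary_data['average_salary'] = (salary_data['salary_min'] + salary_data['salary_max']) / 2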

# Analyze by job title
title_salary = salary_data.groupby('title')['average_salary'].mean().sort_values(ascending=False)

print(title_salary.head(10))  # Top 10 job titles by average salary

3. Visualize results:

import matplotlib.pyplot as plt

title_salary.head(10).plot(kind='bar')
plt.xlabel('Job Title')
plt.ylabel('Average Salary')
plt.title('Top 10 Job Titles by Average Salary')
plt.show()

Use Case 4: Job Posting Trends

Objective: Analyze trends in job postings over time to identify seasonal patterns and growth areas.

Steps:

1. Download and load data:

curl -o jobs.csv.gz https://jobdataapi.com/download/<DOWNLOAD_ACCESS_KEY>/jobs.csv.gz
gunzip jobs.csv.gz

2. Analyze job posting trends:

import pandas as pd

jobs = pd.read_csv('jobs.csv')

# Convert published date to datetime
jobs['published'] = pd.to_datetime(jobs['published'])

# Group by month and count job postings
monthly_trends = jobs.groupby(jobs['published'].dt.to_period('M')).size()

print(monthly_trends)

3. Visualize results:

import matplotlib.pyplot as plt

monthly_trends.plot(kind='line')
plt.xlabel('Month')
plt.ylabel('Number of Job Postings')
plt.title('Job Posting Trends Over Time')
plt.show()

These use cases illustrate how the data download feature lets various stakeholders derive actionable insights from comprehensive job and company datasets, enabling informed decision-making and strategic planning.