Retrieving and Working with Industry Data for Imported Jobs
A step-by-step guide through the process of retrieving industry information from the companies associated with the jobs you've already imported.
Prerequisites
Before diving into the tutorial, ensure you have the following:
- API Access: You must have a valid API key for the jobdata API. If you don’t have one, please subscribe to get access.
- Imported Job Data: You should have already imported job data locally using the /api/jobs endpoint.
- Basic Knowledge of REST APIs: Understanding how to interact with RESTful APIs using tools like curl or programming languages such as Python.
- Python Installed: We’ll use Python for the API calls in this tutorial. Ensure you have Python 3.x and the requests library (pip install requests) installed on your machine.
Understanding the Job Data Schema
When you import job data using the /api/jobs endpoint, each job entry contains a company object. This object includes basic information about the company that posted the job, such as the company ID, name, and logo. However, it does not directly include industry information.
Example Job Data Response
{
    "id": 123,
    "title": "Software Engineer",
    "company": {
        "id": 45,
        "name": "Tech Innovators Ltd",
        "logo": "https://example.com/logo.png",
        "website_url": "https://techinnovators.com",
        "linkedin_url": "https://linkedin.com/company/tech-innovators",
        "twitter_handle": "@techinnovators"
    },
    "location": "San Francisco, CA",
    "published": "2023-08-01T12:00:00Z",
    "description": "Job description here...",
    "experience_level": "MI",
    "salary_min": "80000",
    "salary_max": "120000",
    "salary_currency": "USD",
    "application_url": "https://techinnovators.com/careers/12345"
}
In the above response, the company object includes the company ID (id) and a few contact fields, but no industry information. We need to retrieve the industry data separately, using that ID.
Step 1: Fetching Job Data
If you haven’t already imported job data, you can use the /api/jobs endpoint to fetch it. Here’s how you can do it using Python.
Example API Call to Fetch Jobs
import requests

url = "https://jobdataapi.com/api/jobs"
headers = {
    "Authorization": "Api-Key YOUR_API_KEY"
}
params = {
    "page": 1,
    "page_size": 500
}

response = requests.get(url, headers=headers, params=params)

if response.status_code == 200:
    jobs = response.json()["results"]
    print("Fetched jobs:", jobs)
else:
    print("Failed to fetch jobs:", response.status_code, response.text)
This code fetches a list of jobs with the associated company data. You can adjust the page_size parameter to change how many jobs each request returns. The response will include the company ID, which is essential for the next steps.
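A single request returns only one page of results. The sketch below shows one way to walk through every page. It assumes the response body carries a "next" field that is empty on the last page, which the tutorial does not confirm, and fetch_page is a hypothetical helper standing in for the requests.get call above. A small in-memory stub replaces the real API so the example runs without network access.

```python
from typing import Callable, Iterator


def iter_all_jobs(fetch_page: Callable[[int], dict], max_pages: int = 100) -> Iterator[dict]:
    """Yield jobs page by page until no further page is reported.

    fetch_page(page_number) is a hypothetical helper that returns the decoded
    JSON body for one page, e.g. a thin wrapper around requests.get.
    """
    for page in range(1, max_pages + 1):
        body = fetch_page(page)
        yield from body.get("results", [])
        # Assumption: a missing or empty "next" value marks the last page.
        if not body.get("next"):
            break


# Stand-in for the real API (made-up data, not actual responses).
fake_pages = {
    1: {"results": [{"id": 1}, {"id": 2}], "next": "page-2"},
    2: {"results": [{"id": 3}], "next": None},
}

all_jobs = list(iter_all_jobs(lambda page: fake_pages[page]))
print(len(all_jobs))
```

The max_pages cap is a safety net so that an unexpected response shape cannot turn the loop into an endless series of requests.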
Step 2: Extracting Company IDs
Once you have the job data locally, extract the company IDs. These IDs are required to query the company information from the jobdata API.
Example: Extracting Company IDs
company_ids = [job["company"]["id"] for job in jobs]
print("Extracted Company IDs:", company_ids)
This code snippet extracts the id field from the company object in each job entry and stores it in a list called company_ids.
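Several jobs are often posted by the same company, so the extracted list can contain repeated IDs, and each repeat would trigger another identical company request later on. As a minimal sketch, the hypothetical helper below keeps each ID once, in first-seen order, and skips jobs without a usable company object (whether such jobs occur in practice is an assumption).

```python
def unique_company_ids(jobs):
    """Return company IDs in first-seen order, without duplicates."""
    seen = {}  # dicts preserve insertion order, so this doubles as an ordered set
    for job in jobs:
        company = job.get("company") or {}
        company_id = company.get("id")
        if company_id is not None:
            seen.setdefault(company_id, None)
    return list(seen)


# Made-up sample data shaped like the job entries above.
sample_jobs = [
    {"id": 123, "company": {"id": 45}},
    {"id": 124, "company": {"id": 45}},
    {"id": 125, "company": {"id": 7}},
    {"id": 126, "company": None},
]

print(unique_company_ids(sample_jobs))  # [45, 7]
```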
Step 3: Retrieving Company Information
With the company IDs in hand, you can now query the /api/companies endpoint to retrieve detailed information about each company, including the industry they operate in.
Example API Call to Fetch Company Information
company_info_url = "https://jobdataapi.com/api/companies"
company_info_list = []

for company_id in company_ids:
    response = requests.get(f"{company_info_url}/{company_id}", headers=headers)

    if response.status_code == 200:
        company_info = response.json()
        company_info_list.append(company_info)
        print(f"Retrieved data for company ID {company_id}: {company_info['name']}")
    else:
        print(f"Failed to retrieve company info for ID {company_id}: {response.status_code}")

# Example output:
# Retrieved data for company ID 45: Tech Innovators Ltd
This code iterates through the list of company IDs and fetches the detailed information for each company. The industry information is included in the response under the info_industry key.
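When a request fails, the loop above only prints a message, so the failed IDs are lost. One possible variation, sketched here, records them so they can be retried later. fetch_company is a hypothetical helper that returns the decoded company JSON, or None on failure. In real use it would wrap the requests.get call, and here a stub takes its place.

```python
def fetch_companies(company_ids, fetch_company):
    """Fetch each distinct company once and collect the IDs that failed."""
    companies = {}
    failed = []
    for company_id in company_ids:
        if company_id in companies or company_id in failed:
            continue  # this ID was already handled in this run
        data = fetch_company(company_id)
        if data is None:
            failed.append(company_id)
        else:
            companies[company_id] = data
    return companies, failed


# Stub standing in for the API: only company 45 is "known".
calls = []

def fake_fetch(company_id):
    calls.append(company_id)
    known = {45: {"id": 45, "name": "Tech Innovators Ltd"}}
    return known.get(company_id)


companies, failed = fetch_companies([45, 45, 99], fake_fetch)
print(sorted(companies), failed, calls)
```

Keeping the failures in a separate list makes it easy to run the same function again over just those IDs once a temporary problem has passed.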
Example Company Data Response
{
    "id": 45,
    "name": "Tech Innovators Ltd",
    "logo": "https://example.com/logo.png",
    "website_url": "https://techinnovators.com",
    "linkedin_url": "https://linkedin.com/company/tech-innovators",
    "twitter_handle": "@techinnovators",
    "github_url": "https://github.com/techinnovators",
    "info_description": "A leading tech company specializing in AI solutions.",
    "info_hq": "San Francisco, CA",
    "info_size": 200,
    "info_founded": 2010,
    "info_specialties": "AI, Machine Learning, Robotics",
    "info_industry": {
        "id": 3,
        "name": "Information Technology"
    },
    "info_type": {
        "id": 2,
        "name": "Private"
    }
}
In this response, the info_industry object contains both the id and name of the industry.
Step 4: Extracting Industry Data
Now that you have the detailed company information, you can extract the industry data for each company.
Example: Extracting Industry Data
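industry_data = []
for company in company_info_list:
    # Assumption: info_industry may be missing or null for some companies,
    # so fall back to an empty dict instead of raising a TypeError.
    industry = company.get("info_industry") or {}
    industry_data.append({
        "company_id": company["id"],
        "company_name": company["name"],
        "industry_id": industry.get("id"),
        "industry_name": industry.get("name"),
    })

print("Extracted Industry Data:", industry_data)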
This code snippet creates a list of dictionaries, each containing the company ID, company name, industry ID, and industry name. This data can be used for further processing or storage.
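To use the extracted data with the jobs you imported, the industry can be joined back onto each job by company ID. The following is a minimal sketch. attach_industry is a hypothetical helper, the industry_id and industry_name keys added to each job are illustrative names, and the sample records are made up.

```python
def attach_industry(jobs, industry_data):
    """Return copies of the jobs with industry fields looked up by company ID."""
    by_company = {row["company_id"]: row for row in industry_data}
    enriched = []
    for job in jobs:
        row = by_company.get(job["company"]["id"], {})
        enriched.append({
            **job,
            # None when no industry is known for the job's company.
            "industry_id": row.get("industry_id"),
            "industry_name": row.get("industry_name"),
        })
    return enriched


sample_jobs = [
    {"id": 123, "title": "Software Engineer", "company": {"id": 45}},
    {"id": 124, "title": "Data Analyst", "company": {"id": 99}},
]
sample_industry_data = [
    {"company_id": 45, "company_name": "Tech Innovators Ltd",
     "industry_id": 3, "industry_name": "Information Technology"},
]

for job in attach_industry(sample_jobs, sample_industry_data):
    print(job["title"], "->", job["industry_name"])
```

Building the lookup dictionary once keeps the join at a single pass over the jobs, rather than searching the industry list for every job.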
Step 5: Storing and Utilizing Industry Data
Once you have extracted the industry data, you may want to store it in your local database or use it directly in your application. Here’s how you can approach this.
Example: Storing Industry Data in a Database
Assuming you’re using SQLite for local storage, here’s an example of how to store the industry data.
import sqlite3

# Connect to SQLite database (or create it)
conn = sqlite3.connect("jobs.db")
cursor = conn.cursor()

# Create a table for storing industry data
cursor.execute("""
    CREATE TABLE IF NOT EXISTS industries (
        company_id INTEGER PRIMARY KEY,
        company_name TEXT,
        industry_id INTEGER,
        industry_name TEXT
    )
""")
conn.commit()

# Insert the extracted industry data into the table.
# OR REPLACE overwrites an existing row with the same company_id, so repeated
# company IDs or a second run do not raise an IntegrityError.
cursor.executemany("""
    INSERT OR REPLACE INTO industries (company_id, company_name, industry_id, industry_name)
    VALUES (:company_id, :company_name, :industry_id, :industry_name)
""", industry_data)
conn.commit()

print("Industry data stored successfully.")
Example: Querying Industry Data
After storing the industry data, you can query it to enhance your job listings, generate reports, or perform further analysis.
# Example query: Get all companies in the 'Information Technology' industry
cursor.execute("""
SELECT * FROM industries WHERE industry_name = 'Information Technology'
""")
it_companies = cursor.fetchall()
print("Companies in the Information Technology industry:", it_companies)
This query retrieves all companies that operate in the "Information Technology" industry from the industries table.
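For reports, an aggregate query is often more useful than a filter on one industry. The sketch below counts companies per industry. It uses an in-memory database with made-up rows so it runs on its own, and the table layout mirrors the industries table created earlier. The two extra companies and the "Healthcare" industry are invented sample data.

```python
import sqlite3

# Separate in-memory database so the example does not touch jobs.db.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("""
    CREATE TABLE industries (
        company_id INTEGER PRIMARY KEY,
        company_name TEXT,
        industry_id INTEGER,
        industry_name TEXT
    )
""")
demo_conn.executemany(
    "INSERT INTO industries VALUES (?, ?, ?, ?)",
    [
        (45, "Tech Innovators Ltd", 3, "Information Technology"),
        (46, "Example Co A", 3, "Information Technology"),
        (47, "Example Co B", 5, "Healthcare"),
    ],
)

# Count companies per industry, largest group first.
industry_counts = demo_conn.execute("""
    SELECT industry_name, COUNT(*) AS company_count
    FROM industries
    GROUP BY industry_name
    ORDER BY company_count DESC, industry_name
""").fetchall()
demo_conn.close()

print(industry_counts)  # [('Information Technology', 2), ('Healthcare', 1)]
```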
Conclusion
By following this tutorial, you’ve learned how to:
- Fetch job data using the jobdata API.
- Extract company IDs from imported job data.
- Retrieve detailed company information, including industry data.
- Extract and store industry information locally.
- Utilize the stored industry data for further analysis or integration into your application.
Enriching your job listings with industry information adds significant value to your application, enabling better categorization, filtering, and insights. This tutorial should serve as a foundation for further enhancements and customizations based on your specific use case.