
Text Analytics API Reference

A comprehensive collection of text analytics endpoints, including text summarization, classification, and named entity recognition (NER).

1: Text Language Detection
2: Named Entity Extraction
3: Text Summarization
4: Text Classification
5: Classify News Articles Using Smart Labels
6: Aspect Based Sentiments Analysis
7: Document Sentiments Score
8: Keyword or Keyphrase Extraction

The new generation of large language models (LLMs) such as GPT-3, ChatGPT/GPT-3.5, and GPT-4 has revolutionized the way we analyze text data.

Our base-level text classification endpoint includes a few dozen topic labels. If you are interested in a more granular text classifier (IAB/IPTC plus a custom taxonomy) that contains over 1900 topics, please email us at info@specrom.com.

Our text analytics endpoints use the latest GPT-J and/or GPT-3.5/GPT-4 models on the back end to analyze all aspects of news articles.

1 - Text Language Detection

A Specrom News API endpoint for detecting the language of the input text.

Input


import requests

url = "https://specrom-news-api.p.rapidapi.com/"

payload = {
  "api_type": "language_detection",
  "input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}


headers = {
  "content-type": "application/json",
  "X-RapidAPI-Key": "API_Key",
  "X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)

	
	POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 192

{
	"api_type": "language_detection",
	"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}

Parameter	Description
api_type	language_detection
input_text	The text whose language is to be detected
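
Every example on this page posts to the same URL with the same headers and varies only api_type and input_text. A minimal sketch of a shared helper follows. The names build_request and KNOWN_API_TYPES, and the local empty-text check, are illustrative assumptions and not part of the Specrom API.

```python
API_URL = "https://specrom-news-api.p.rapidapi.com/"
API_HOST = "specrom-news-api.p.rapidapi.com"

# api_type values documented on this page.
KNOWN_API_TYPES = {
    "language_detection",
    "named_entity_extraction",
    "summarization",
    "topic_detection_base_classifier",
    "smart_labels",
    "aspect_based_sentiments_analysis",
    "overall_sentiments_score",
    "keyword_extraction",
}


def build_request(api_type, input_text, api_key):
    """Return keyword arguments suitable for requests.request(**kwargs)."""
    if api_type not in KNOWN_API_TYPES:
        raise ValueError(f"unknown api_type: {api_type!r}")
    # Local sanity check (an assumption; the page does not say how the
    # service treats empty text).
    if not input_text.strip():
        raise ValueError("input_text must not be empty")
    return {
        "method": "POST",
        "url": API_URL,
        "json": {"api_type": api_type, "input_text": input_text},
        "headers": {
            "content-type": "application/json",
            "X-RapidAPI-Key": api_key,
            "X-RapidAPI-Host": API_HOST,
        },
    }


kwargs = build_request("language_detection", "Bonjour tout le monde", "API_Key")
print(kwargs["json"])
```

The returned dictionary can be passed directly to requests.request(**kwargs), which is the call used in the examples on this page.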

Output



{
  "documents": [
    {
      "Detected_language": [
        {"ISO631-1_language_code": "en", "normalized_probability": 1}
      ],
      "id": "1"
    }
  ]
}

This endpoint returns a JSON object containing the following elements:

Parameter	Description
documents	A list containing a single dictionary
ISO631-1_language_code	Two-letter ISO 639-1 code of the detected language (the key is spelled ISO631-1 in the response)
normalized_probability	Probability estimate (0-1) indicating the level of certainty in the predicted value
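
As a minimal sketch, the response shown above can be reduced to a single language code as follows. The helper name top_language is hypothetical, and the code assumes that Detected_language may hold more than one candidate (the sample contains only one).

```python
import json

# Sample response copied from the output above.
sample = (
    '{"documents": [{"Detected_language": '
    '[{"ISO631-1_language_code": "en", "normalized_probability": 1}], '
    '"id": "1"}]}'
)


def top_language(response_text):
    """Return (language_code, probability) for the most probable candidate."""
    data = json.loads(response_text)
    candidates = data["documents"][0]["Detected_language"]
    best = max(candidates, key=lambda c: c["normalized_probability"])
    return best["ISO631-1_language_code"], best["normalized_probability"]


print(top_language(sample))
```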

2 - Named Entity Extraction

Extract Named Entities from the input text using our API endpoint.


Input

import requests

url = "https://specrom-news-api.p.rapidapi.com/"

payload = {
  "api_type": "named_entity_extraction",
  "input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}


headers = {
  "content-type": "application/json",
  "X-RapidAPI-Key": "API_key",
  "X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)

	
	POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 96

{
    "api_type": "named_entity_extraction",
    "input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}

Parameter	Description
api_type	named_entity_extraction
input_text	input text


[
{
"$69.7 billion": "MONEY"
},
{
"2019": "DATE"
},
{
"more than 98%": "PERCENT"
},
{
"the year": "DATE"
},
{
"Coca Cola": "ORG"
},
{
"8 million": "CARDINAL"
},
{
"earlier this year": "DATE"
},
{
"100": "CARDINAL"
},
{
"$4.2 billion": "MONEY"
},
{
"last year": "DATE"
},
{
"only about 6%": "PERCENT"
},
{
"April 2019": "DATE"
},
{
"COO Sheryl Sandberg": "PERSON"
},
{
"100": "CARDINAL"
}
.
.
.
(Output Truncated)
]

Our API will extract the following entity types.

PERSON - People, including fictional.

NORP - Nationalities or religious or political groups.

FAC - Buildings, airports, highways, bridges, etc.

ORG - Companies, agencies, institutions, etc.

GPE - Countries, cities, states.

LOC - Non-GPE locations, mountain ranges, bodies of water.

PRODUCT - Objects, vehicles, foods, etc. (Not services.)

EVENT - Named hurricanes, battles, wars, sports events, etc.

WORK_OF_ART - Titles of books, songs, etc.

LAW - Named documents made into laws.

LANGUAGE - Any named language.

DATE - Absolute or relative dates or periods.

TIME - Times smaller than a day.

PERCENT - Percentage, including “%”.

MONEY - Monetary values, including unit.

QUANTITY - Measurements, as of weight or distance.

ORDINAL - “first”, “second”, etc.

CARDINAL - Numerals that do not fall under another type.
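
The output above is a list of single-key dictionaries mapping each entity string to its type. A minimal sketch for grouping the entities by type is given below. The helper name group_entities is hypothetical, and the sample is a subset of the output shown above.

```python
from collections import defaultdict

# Subset of the sample output shown above.
sample_entities = [
    {"$69.7 billion": "MONEY"},
    {"2019": "DATE"},
    {"Coca Cola": "ORG"},
    {"100": "CARDINAL"},
    {"April 2019": "DATE"},
    {"COO Sheryl Sandberg": "PERSON"},
    {"100": "CARDINAL"},
]


def group_entities(entities):
    """Map each entity type to its distinct entity strings, in first-seen order."""
    grouped = defaultdict(list)
    for item in entities:
        for text, label in item.items():
            if text not in grouped[label]:  # drop repeated mentions
                grouped[label].append(text)
    return dict(grouped)


grouped = group_entities(sample_entities)
print(grouped["DATE"])
```

Repeated mentions, such as the two "100" entries in the sample, are collapsed into one; drop the membership check if mention counts matter.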

3 - Text Summarization

This endpoint generates a summary of the entered text using a state-of-the-art, LLM-based abstractive summarization model.

The summary generated by this API endpoint is abstractive in nature and will be similar to what you see using ChatGPT/GPT-3.5 or other LLMs. If you are looking for our older extractive summarization endpoint, let us know and we can share that with you.


Input

import requests

url = "https://specrom-news-api.p.rapidapi.com/"

payload = {
  "api_type": "summarization",
  "input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}

headers = {
  "content-type": "application/json",
  "X-RapidAPI-Key": "API_key",
  "X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)

	
	POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 96

{
    "api_type": "summarization",
    "input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}

Parameter	Description
api_type	summarization
input_text	Input Text

Output



The recent scandal and dropouts have impacted Facebook's stock price, but advertising still accounts for more than 98% of the company's revenue, with small and medium-sized businesses making up the majority of ad dollars spent. Facebook has 8 million advertisers, and the top 100 brands only contributed about 6% of the platform's ad revenue last year. Despite the controversy, it will take more to stop Facebook's digital advertising juggernaut.

4 - Text Classification

Input the headline or meta description of a news article to generate a topic tag using a text classifier.

All our plans include a base-level text classifier taxonomy; if you need a more granular text classifier that contains over 1900 topics, please email us at info@specrom.com.


The topics for the base level text classifier taxonomy are:

“arts and entertainment”, “automotive”, “business”, “careers”, “education”, “family and parenting”, “food and drink”, “health and fitness”, “hobbies and interests”, “home and garden”, “illegal content”, “law and government and politics”, “non standard content”, “personal finance”, “pets”, “real estate”, “religion and spirituality”, “science”, “shopping”, “society”, “sports”, “style and fashion”, “technology and computing”, “travel”

Input

import requests

url = "https://specrom-news-api.p.rapidapi.com/"

payload = {
  "api_type": "topic_detection_base_classifier",
  "input_text": "Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?"
}
headers = {
  "content-type": "application/json",
  "X-RapidAPI-Key": "API_key",
  "X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)

	
	POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 96

{
    "api_type": "topic_detection_base_classifier",
    "input_text": "Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?"
}

Parameter	Description
api_type	topic_detection_base_classifier
input_text	input text

Output



{"Topic":"business"}

Parameter	Description
Topic	Predicted topic tag of the input text
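
A minimal sketch for reading the Topic field and checking it against the base taxonomy listed above. The helper name parse_topic is hypothetical; the membership flag is a convenience for callers, since accounts using the larger custom taxonomy would presumably receive topics outside this set.

```python
import json

# Base-level taxonomy as listed on this page.
BASE_TOPICS = {
    "arts and entertainment", "automotive", "business", "careers",
    "education", "family and parenting", "food and drink",
    "health and fitness", "hobbies and interests", "home and garden",
    "illegal content", "law and government and politics",
    "non standard content", "personal finance", "pets", "real estate",
    "religion and spirituality", "science", "shopping", "society",
    "sports", "style and fashion", "technology and computing", "travel",
}


def parse_topic(response_text):
    """Return (topic, is_base_topic) from a raw classifier response."""
    topic = json.loads(response_text)["Topic"]
    return topic, topic in BASE_TOPICS


print(parse_topic('{"Topic":"business"}'))
```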

5 - Classify News Articles Using Smart Labels

A Specrom News API endpoint for labeling common types of articles seen on news websites

News domains these days carry a lot of content besides news itself that helps them monetize their websites, either through pay-per-click ads or through affiliate links. The smart labels classifier will help you identify such content.


Almost all news domains, including major outlets such as the New York Times, publish many articles that are not hard news but rather clickbait.

Use our smart labels to identify and filter out such news articles by using headline and meta_description text as input. The labels are below:

Non-News: This category includes articles that are often referred to as “soft news.” Unlike traditional news articles that report on a specific event or breaking news, non-news articles focus on evergreen topics that are not time-sensitive. Examples of non-news articles include how-to guides, tips, reviews, and general profiles. These articles may be more feature-like in nature, and can often be enjoyed by readers at any time.
Opinion: This category includes articles that express a strong point of view, such as editorials, opinion pieces, letters to the editor, or other content that may be subjective in nature.
Paid News: This category includes articles that are sponsored or paid for by a brand or advertiser, often in the form of advertorials. The goal of these articles is typically to promote a product, service, or brand.
Pop Culture: This category covers articles related to entertainment and popular culture, such as stories about celebrities, movies, TV shows, music, fashion, and other trends.
Fact Check: This category includes articles that seek to verify the validity of rumors or questionable claims, with the goal of combating misinformation. Fact-checking articles typically provide evidence-based information and sources to support their claims.
Roundup: This category includes articles that summarize multiple stories or provide a collection of concepts, takeaways, data analysis, or lists. Roundup articles can be useful for readers who want to quickly get up-to-speed on a particular topic or trend.
Press Release: This category includes official statements or announcements, typically published by wire services and authored by organizations or PR professionals. Press releases may cover a variety of topics, such as new products, partnerships, or other news related to the organization.
News: This category includes traditional news articles that report on a specific event or breaking news. These articles are typically objective in nature and report on facts related to the event or news story.

Input

import requests

url = "https://specrom-news-api.p.rapidapi.com/"

payload = {
  "api_type": "smart_labels",
  "input_text": "Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?"
}
headers = {
  "content-type": "application/json",
  "X-RapidAPI-Key": "API_Key",  # replace with your RapidAPI key
  "X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)

	
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 381

{
    "api_type": "smart_labels",
    "input_text": "Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?"
}

Parameter	Description
api_type	smart_labels
input_text	input text (generally the headline plus the text in meta_description tag)

Output



{"Topic":"Non-news"}

Parameter	Description
Topic	The predicted smart label of the news article (one of the labels listed above)
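
A minimal sketch of filtering a batch of headlines down to hard news using the returned label. The helper names are hypothetical, and the second article with its response is invented for illustration. Labels are compared case-insensitively because the sample output reads "Non-news" while the label list spells it "Non-News".

```python
import json


def smart_label(response_text):
    """Return the smart label in lower case, e.g. 'non-news'."""
    return json.loads(response_text)["Topic"].strip().lower()


def keep_hard_news(articles):
    """articles: iterable of (headline, raw_response_text) pairs."""
    return [headline for headline, raw in articles if smart_label(raw) == "news"]


articles = [
    # Headline and response from the example above.
    ("Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?",
     '{"Topic":"Non-news"}'),
    # Hypothetical headline and response, for illustration only.
    ("Hypothetical breaking headline", '{"Topic":"News"}'),
]

print(keep_hard_news(articles))
```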

6 - Aspect Based Sentiments Analysis

A Specrom News API endpoint for extracting entities from the input text and predicting the sentiment towards each entity.

Extract topics (also known as aspects or entities) from the input text and predict the sentiment towards each of the topics.

If you are instead looking for a document-level sentiment score, see the Document Sentiments Score endpoint.

Input

import requests

url = "https://specrom-news-api.p.rapidapi.com/"

payload = {
  "api_type": "aspect_based_sentiments_analysis",
  "input_text": "This is a very solid device. Wonderful job, Apple!  The only thing unexpected about it was the weight... the dimensions are smaller than the old macbook air my wife had, but heavier.  Screen size is the same"
}

headers = {
  "content-type": "application/json",
  "X-RapidAPI-Key": "API_Key",  # replace with your RapidAPI key
  "X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)

	
	POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 151

{
    "api_type": "aspect_based_sentiments_analysis",
    "input_text": "This is a very solid device. Wonderful job, Apple!  The only thing unexpected about it was the weight... the dimensions are smaller than the old macbook air my wife had, but heavier.  Screen size is the same"
}

Parameter	Description
api_type	aspect_based_sentiments_analysis
input_text	input text

Output



{
  "Response": [
    {
      "Aspect": "Device",
      "Sentiment": "Positive"
    },
    {
      "Aspect": "Weight",
      "Sentiment": "Negative"
    },
    {
      "Aspect": "Dimensions",
      "Sentiment": "Positive"
    },
    {
      "Aspect": "Screen Size",
      "Sentiment": "Positive"
    }
  ]
}

Parameter	Description
Aspect	The extracted entities or aspects from the input text. Generally speaking, this will be some attribute, person, company, product, service etc.
Sentiment	A label of Positive, Negative or Neutral.
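
A minimal sketch that buckets the extracted aspects by sentiment label, using the sample response above. The helper name aspects_by_sentiment is hypothetical.

```python
# Sample response copied from the output above.
sample = {
    "Response": [
        {"Aspect": "Device", "Sentiment": "Positive"},
        {"Aspect": "Weight", "Sentiment": "Negative"},
        {"Aspect": "Dimensions", "Sentiment": "Positive"},
        {"Aspect": "Screen Size", "Sentiment": "Positive"},
    ]
}


def aspects_by_sentiment(response):
    """Group aspect names under Positive, Negative and Neutral."""
    buckets = {"Positive": [], "Negative": [], "Neutral": []}
    for item in response["Response"]:
        # setdefault tolerates a label outside the three documented ones.
        buckets.setdefault(item["Sentiment"], []).append(item["Aspect"])
    return buckets


buckets = aspects_by_sentiment(sample)
print(buckets["Negative"])
```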

7 - Document Sentiments Score

A Specrom News API endpoint for predicting the document level sentiments score for the input text

This endpoint takes input text and predicts an overall sentiment score for the document.

If you are instead looking for aspect-level sentiment labels, see the Aspect Based Sentiments Analysis endpoint.

Input

import requests

url = "https://specrom-news-api.p.rapidapi.com/"

payload = {
  "api_type": "overall_sentiments_score",
  "input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}

headers = {
  "content-type": "application/json",
  "X-RapidAPI-Key": "API_Key",  # replace with your RapidAPI key
  "X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)

	
	POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 151

{
    "api_type": "overall_sentiments_score",
    "input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}

Parameter	Description
api_type	overall_sentiments_score
input_text	input text

Output



{"documents":[{"sentiments_score":0.5673076923076923,"id":"1"}]}

Parameter	Description
sentiments_score	A sentiment score from 0 to 1, with 1 being very positive and 0 being very negative. Values close to 0.5 are neutral.
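
A minimal sketch that turns the score into a coarse label. The page only says that values close to 0.5 are neutral, so the width of the neutral band (neutral_band, 0.1 here) is an assumption to tune for your data; the helper name sentiment_label is hypothetical.

```python
import json


def sentiment_label(score, neutral_band=0.1):
    """Map a 0-1 score to 'positive', 'negative' or 'neutral'.

    neutral_band is an assumed half-width around 0.5, not an API value.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if score > 0.5 + neutral_band:
        return "positive"
    if score < 0.5 - neutral_band:
        return "negative"
    return "neutral"


# Sample response copied from the output above.
sample = '{"documents":[{"sentiments_score":0.5673076923076923,"id":"1"}]}'
score = json.loads(sample)["documents"][0]["sentiments_score"]
print(score, sentiment_label(score))
```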

8 - Keyword or Keyphrase Extraction

A Specrom News API endpoint for extracting keywords or keyphrases from the input text.

This endpoint takes input text and extracts the most relevant keywords and keyphrases from it.

Currently this model is in beta and only available for English.

Input

import requests

url = "https://specrom-news-api.p.rapidapi.com/"

payload = {
  "api_type": "keyword_extraction",
  "input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}

headers = {
  "content-type": "application/json",
  "X-RapidAPI-Key": "API_Key",  # replace with your RapidAPI key
  "X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)

	
	POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 151

{
    "api_type": "keyword_extraction",
    "input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}

Parameter	Description
api_type	keyword_extraction
input_text	input text

Output



[
{"keyphrase":"digital advertising juggernaut","weight":0.01},
{"keyphrase":"dented Facebook stock","weight":0.01},
{"keyphrase":"Facebook stock price","weight":0.01},
{"keyphrase":"company digital advertising","weight":0.01},
{"keyphrase":"big-name dropouts","weight":0.03},
{"keyphrase":"dropouts have dented","weight":0.03},
{"keyphrase":"stock price","weight":0.03},
{"keyphrase":"price and prompted","weight":0.03},
{"keyphrase":"prompted leadership","weight":0.03},
{"keyphrase":"leadership to address","weight":0.03}]

Parameter	Description
keyphrase	Extracted keyword or keyphrase that is central to the concept/idea/person/company being discussed in the input_text
weight	A score (0-1) that indicates the overall weight of that keyword/keyphrase
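
A minimal sketch for ranking the returned keyphrases by weight. The page does not state whether a higher or a lower weight means a more relevant keyphrase (the sample output is listed in ascending order), so the direction is left as an explicit parameter; the names top_keyphrases and highest_first are hypothetical.

```python
# Sample response copied from the output above.
sample = [
    {"keyphrase": "digital advertising juggernaut", "weight": 0.01},
    {"keyphrase": "dented Facebook stock", "weight": 0.01},
    {"keyphrase": "Facebook stock price", "weight": 0.01},
    {"keyphrase": "company digital advertising", "weight": 0.01},
    {"keyphrase": "big-name dropouts", "weight": 0.03},
    {"keyphrase": "dropouts have dented", "weight": 0.03},
    {"keyphrase": "stock price", "weight": 0.03},
    {"keyphrase": "price and prompted", "weight": 0.03},
    {"keyphrase": "prompted leadership", "weight": 0.03},
    {"keyphrase": "leadership to address", "weight": 0.03},
]


def top_keyphrases(items, n, highest_first):
    """Return n keyphrases ordered by weight; ties keep response order."""
    ranked = sorted(items, key=lambda k: k["weight"], reverse=highest_first)
    return [k["keyphrase"] for k in ranked[:n]]


print(top_keyphrases(sample, n=3, highest_first=False))
print(top_keyphrases(sample, n=3, highest_first=True))
```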