Documentation
- 1: News API Reference
- 1.1: Simple News Search
- 1.2: Get News Headlines By Keyword Search
- 1.3: Get Latest News By City, State, Country
- 1.4: Get Latest News By Latitude, Longitude
- 1.5: Fetch Parsed News Article By url_id
- 1.6: Fetch Parsed News Article By URL
- 2: Text Analytics API Reference
- 2.1: Text Language Detection
- 2.2: Named Entity Extraction
- 2.3: Text Summarization
- 2.4: Text Classification
- 2.5: Classify News Articles Using Smart Labels
- 2.6: Aspect Based Sentiments Analysis
- 2.7: Document Sentiments Score
- 2.8: Keyword or Keyphrase Extraction
- 3: Technology Analysis API Reference
- 3.1: Built With Analyzer
- 4: Social Media & Email Addresses API Reference
1 - News API Reference
1.1 - Simple News Search
We recommend that all new users try this API endpoint first, and move to the more advanced keyword search endpoints only when they need more results (using similar keyword search), results with very low latency (instead of 24h, as in this case), or historical search.
This endpoint will fetch news articles with full text using a variety of inputs.
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "exhaustive_news_search_24h",
"q": "bitcoin",
"author_only": "",
"content": "",
"domains": "",
"page": "1",
"qInTitle": "",
"topic": ""
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 192
{
"api_type": "exhaustive_news_search_24h",
"q": "bitcoin",
"author_only": "",
"content": "",
"domains": "",
"page": "1",
"qInTitle": "",
"topic": ""
}
Parameter | Description |
---|---|
api_type | exhaustive_news_search_24h |
domains | The domain URL for filtering by news source. In this version, you can only specify one domain address. If you do not want to filter by domain address, then simply leave the field empty ("") |
topic | Filter the news articles by their topic. Options include: “politics”, “tech”, “entertainment”, “business”, “sport”. If you want to fetch all articles without topic filtering, then simply specify “”. |
q | Filter the news articles by a keyword; optionally, you can leave it empty ("") if you do not wish to filter by keyword. Only one keyword is supported by this API; if you want to search by phrases, then we recommend using our keyword news search endpoint |
qInTitle | filter the news articles by a keyword present in the title. Only one keyword is supported by this endpoint. |
content | If you want full text of the news articles in the content key of the response JSON then set this field to “true” otherwise set it to “false”. |
author_only (optional) | if “true” then it only returns results with author names. |
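The parameter table above can be turned into a small payload builder. The sketch below is only illustrative: `build_simple_search_payload` and `ALLOWED_TOPICS` are hypothetical names, not part of the API, and sending "false" where the sample request sends an empty string is an assumption based on the table's description of the `content` and `author_only` fields.

```python
# Hypothetical helper (not part of the API): builds the request body for
# the exhaustive_news_search_24h endpoint from the documented parameters.
ALLOWED_TOPICS = {"", "politics", "tech", "entertainment", "business", "sport"}


def build_simple_search_payload(q="", topic="", domains="", q_in_title="",
                                content=False, author_only=False, page=1):
    if topic not in ALLOWED_TOPICS:
        raise ValueError(f"unsupported topic: {topic!r}")
    # The table says q and qInTitle accept a single keyword only.
    for name, value in (("q", q), ("qInTitle", q_in_title)):
        if len(value.split()) > 1:
            raise ValueError(f"{name} supports one keyword, got {value!r}")
    return {
        "api_type": "exhaustive_news_search_24h",
        "q": q,
        "qInTitle": q_in_title,
        "topic": topic,
        "domains": domains,
        # Assumption: the flags are sent as the strings "true" / "false".
        "content": "true" if content else "false",
        "author_only": "true" if author_only else "false",
        "page": str(page),
    }
```

The returned dictionary can be passed as the `json` argument of the request shown earlier in this section.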
Output
{
"Article": [
{
"author": "[\"Lauren Fox and Ted Barrett, CNN\"]",
"description": "Divisions within the Republican conference spilled out once again Tuesday as GOP senators dismissed key pieces of their own leadership's stimulus proposal not even a day after its release.",
"publishedAt": "2020-07-28",
"source_name": "CNN",
"source_url": "cnn.com",
"title": "Republicans revolt against GOP's initial stimulus plan - CNNPolitics",
"url": "https://www.cnn.com/2020/07/28/politics/republican-reaction-gop-stimulus-plan/index.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+rss%2Fcnn_allpolitics+%28RSS%3A+CNN+-+Politics%29",
"urlToImage": "https://cdn.cnn.com/cnnnext/dam/assets/200518181852-senator-ben-sasse-1-super-tease.jpg"
},
.
.
(output truncated)
],
"status": "ok",
"totalResults": 113
}
This endpoint returns a JSON object containing the following elements:
Parameter | Description |
---|---|
status | This key will return “ok” if results from all the shards are being fetched correctly. Please raise a support ticket by emailing info@specrom.com if it reports “fail”. |
totalResults | The total number of articles found by our endpoint. Note: these results are not exhaustive. You can use our other keyword search endpoint |
Article | This key contains a list of dictionaries representing individual news articles. |
author | A list of author names for the news article. |
content | Full text of the news article (up to 30% of the text); we truncate the full_text at 30% to respect the rights of the original copyright holder. If your use case doesn’t involve making the full_text public (such as internal data analytics or training an AI model), then you can get the entire full_text of each news article using the fetch complete news article by URL endpoint. |
title | title of the news article |
description | description of the news article extracted and parsed from meta tags of the HTML page. |
publishedAt | publication date of the news article (based on the UTC timezone). |
source_name | Name of the news source. |
source_url | domain address of the news source. |
url | url of the news article |
urlToImage | url of the most relevant image from the news article. |
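In the sample output above, `author` appears as a JSON-encoded string rather than a plain list, so a client may need to decode it before use. The following is a minimal sketch of reading the response under that assumption; `article_authors` and `summarize_search` are hypothetical helper names.

```python
import json


def article_authors(article):
    """Return the authors of one article as a list of strings."""
    raw = article.get("author")
    if isinstance(raw, list):
        return raw
    if not isinstance(raw, str) or not raw:
        return []
    try:
        parsed = json.loads(raw)  # e.g. '["Jane Doe"]' -> ["Jane Doe"]
    except json.JSONDecodeError:
        return [raw]  # plain string, not JSON-encoded
    return parsed if isinstance(parsed, list) else [str(parsed)]


def summarize_search(result):
    """Return (title, authors, url) tuples, or raise if status is not ok."""
    if result.get("status") != "ok":
        raise RuntimeError(f"search failed with status {result.get('status')!r}")
    return [(a.get("title"), article_authors(a), a.get("url"))
            for a in result.get("Article", [])]
```

With the request from this section, `summarize_search(response.json())` would give one tuple per returned article.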
1.2 - Get News Headlines By Keyword Search
This API endpoint is excellent for running queries with multiple keywords or keyphrases.
Fetch near real time news articles by searching using a keyword or keyphrase
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "news_by_keyword_search",
"keyword": "Donald Trump"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 96
{
"api_type": "news_by_keyword_search",
"keyword": "Donald Trump"
}
Parameter | Description |
---|---|
api_type | news_by_keyword_search |
country | two-letter country code |
keyword | Search using the keyword or keyphrase. In case of a keyphrase containing two or more words (say Donald Trump), the keyphrase searched will be “Donald AND Trump”. |
Output
[
{
"source_url":"",
"source name":"",
"published_date":"2023-04-12T10:01:50Z",
"title":"Trump Leads DeSantis In Our New 2024 Republican Primary Polling Average",
"matched_text":"Donald Trump",
"url_id":"W2KracEB6QSg2AMRzLoV0JlBt5QiEJiIoMNEgq9l7vYR4F5yktDFc32OboItXqVpM80wmoALot3dO1a4YNQWuI/Ip4i2Ez2XIpgXyX1yz5RTuzkfMiNOVA4RAZzso5tMkzucFHXVOjNAECDOswQ0Dv1iqTt8JthEm4asfBCGdlqbxLG8vgkThGMlSfxKFNMdz8CpwbKAvYlnAJE2pXKPei1srKjY9Ce1Q0fKpG9HjEhIm06bsEtEtfBRDm8ApfpQLXhvpP20DNJuRHp+QaUNZqFIW6tbBxKac9YCepD5Ac/EpZAI+Ho3tu8h/JPwKoo8SV2ZkEBQGq6E3gutpF0arOpZN1fkkihSETsV0uCXSA==z0JGhuTDznqBaiJ6cMa0QA==k3ZkWXy8+3HBa/h2NQJI4A=="
},
{
"source_url":"",
"source name":"",
"published_date":"2023-04-17T05:07:24Z",
"title":"Trump news – latest: SNL mocks Trump for claim NYPD ‘cried’ during arrest",
"matched_text":"Donald Trump",
"url_id":"UcP9TICRECQitflcKckLSuPVuVymb1DSxJxLopzwZjfaTlzjvtj0vQIwavE4smL7pEUdjKCxAeIqHOPqUYbAYINA88tUx1cQAzxplf5+bcibNJlG+I7rzqTe/6yuch+mGJejhSdzRwUMAfkzQ1FAO4uyzUueVOmPYa+SAnJA/xR5MPKyoJA45rvWLkrUMKptw4vW6eLKAch7lZX976IFqy6Yt3+7vONHBNdN+1QQIjhN1yzJ9hEkww9QWiIbyCPh4z1hgj82AoMApw96nj5LX1LqqGxa/m1WeMFyi02JTf8g0YxcoFoyL0uFjg42y0Cm5vmaXu9FZYwd/aoCFZHzUUP+VfLeA31g7wGRA7096GKbDbhrQU8Xbu2hwLQAMqQidjM1hcE8ShPysXbiKxStKaRG1X256aTIXGSG3EqrRU6M38uwXdeiZs3rypRed5DjUhtxjSdMdvXihhZJKVWT1zCoUzXRTRvfWwfinkLUDTczUcA75/owNrRDq0EVRdvkjzS4JA==btCc5KU/AkvM/Rb1sPfo7g==J5YrGi3Qg8ZgePxkRBF2sg==vuJWPkwttgAi+hWMLJBcng=="
},
.
.
.
(output truncated)
]
Parameter | Description |
---|---|
source_url | web domain of the media outlet |
source name | name of the media outlet |
published_date | date of the news article |
title | title of the news article |
matched_text | input keyword |
url_id | unique url_id of the news article. Use our separate endpoint called fetch news article by url_id to fetch complete details such as full_text, author, url etc. of the news article |
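Because the keyword search returns only headlines and a `url_id`, a typical workflow chains it with the fetch by url_id endpoint. The sketch below illustrates that two-step flow; `fetch_articles_for_keyword` is a hypothetical helper, and `post` is assumed to be any callable that sends a payload to the API and returns the decoded JSON.

```python
def fetch_articles_for_keyword(keyword, post, limit=3):
    """Search by keyword, then fetch the first `limit` articles by url_id.

    `post` is any callable that takes a payload dict and returns the
    decoded JSON response (for example a thin wrapper around requests).
    """
    hits = post({"api_type": "news_by_keyword_search", "keyword": keyword})
    articles = []
    for hit in hits[:limit]:
        # Each hit carries an opaque url_id; pass it through unchanged.
        articles.append(post({
            "api_type": "fetch_parsed_news_article_by_id",
            "url_id": hit["url_id"],
        }))
    return articles
```

Keeping the transport behind `post` means the same function works with any HTTP client, and each fetched article counts as a separate API call.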
1.3 - Get Latest News By City, State, Country
This API endpoint is excellent for fetching local and regional news.
Fetch local news using city, state, country as input
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "news_by_city_country",
"country": "US",
"region": "Atlanta, GA"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 96
{
"api_type": "news_by_city_country",
"country": "US",
"region": "Atlanta, GA"
}
Parameter | Description |
---|---|
api_type | news_by_city_country |
country | two-letter country code |
region | city name followed by state code |
A list of country codes is below
Country | Country Code |
---|---|
USA | US |
Japan | JP |
United Kingdom | UK |
Spain | ES |
Canada | CA |
Germany | DE |
Italy | IT |
France | FR |
Australia | AU |
Taiwan | TW |
Netherlands | NL |
Brazil | BR |
Turkey | TR |
Belgium | BE |
Greece | GR |
India | IN |
Mexico | MX |
Denmark | DK |
Argentina | AR |
Switzerland | CH |
Chile | CL |
Austria | AT |
Korea | KR |
Ireland | IE |
Colombia | CO |
Poland | PL |
Portugal | PT |
Pakistan | PK |
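The country table above can be used to validate input before a request is sent. This is a minimal sketch: `SUPPORTED_COUNTRIES` and `build_local_news_payload` are hypothetical names, and the "City, ST" region format is inferred from the single "Atlanta, GA" sample, so it may not apply to every country.

```python
# Country codes copied from the table above.
SUPPORTED_COUNTRIES = {
    "US", "JP", "UK", "ES", "CA", "DE", "IT", "FR", "AU", "TW",
    "NL", "BR", "TR", "BE", "GR", "IN", "MX", "DK", "AR", "CH",
    "CL", "AT", "KR", "IE", "CO", "PL", "PT", "PK",
}


def build_local_news_payload(city, state_code, country):
    """Build the request body for the news_by_city_country endpoint."""
    country = country.strip().upper()
    if country not in SUPPORTED_COUNTRIES:
        raise ValueError(f"unsupported country code: {country!r}")
    return {
        "api_type": "news_by_city_country",
        "country": country,
        # Assumption: region is "<city>, <state code>" as in the sample.
        "region": f"{city.strip()}, {state_code.strip().upper()}",
    }
```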
Output
{
"article_list": [
{
"source url": "https://www.wsbtv.com",
"source name": "WSB Atlanta",
"published_date": "Fri, 07 Apr 2023 00:49:00 GMT",
"title": "Credit union members say their money being held hostage, can’t access funds - WSB Atlanta",
"url_id": "O8sAqcnybFac+cCXoC/TYt4Wwn5V+FmLjbXhqb98G0qrCZdRnAYHraHKEITnpaXK0J9+5t++Ccd0ddV8HIed/rayUGYk3zqWEHocNJuTD23tEwEhz1td+LENnUbwNS+0xPYlHTgC0p/AT3cU0KqcX+aAIRI8iSwgYj/ChiW31+Vlh1qFp4RFejj8SoGj1lD20vx9redXFlhKj7m/rIkSfTVHbOq+DpmapNZdNjB3lvgAn7DVwNcA5g+gbt0WvHLc4gVRfkjHu3+QjaimG6QPwk4fhJpZWPPR/r8kxMxSkjNbkFvxpgsitCAhc+ryo/wZAahCVR9n0Z9B6MtCN42XcHzuh2JhHZz7twcg0uSFcYXZDsDWtRUxZ4NMddEvOonxMLEOx+6dYS0MSadPK65iHDpVUcGkUHD+O+Nl9b12/1EUGNF6rQ8GPH3GuVu90ky4KJV5DbiH8oPm1+3C7JOa+FiqmB86zP5k9cZ2QyrGexKcOym7ywneq6PSo9hPJS0cvVCY/jilk4r01mT22Tdf9HrlbPvYkbYBbhePiIadMDPAdYDK1GJncq208uSOh+Sf2O+iAIIdSjEXGatgRNYSZTc7*f8QvuCUSqeMUelzr87slBA==*cnjF1ZxjpZgeXTHFwVsazQ==*MkCp3nEWhsKyC4fFJU9XjQ=="
},
{
"source url": "https://www.wsbtv.com",
"source name": "WSB Atlanta",
"published_date": "Fri, 07 Apr 2023 02:59:00 GMT",
"title": "Training facility back in compliance after stop work order issued, order to be lifted Friday morning - WSB Atlanta",
"url_id": "NjEV3/1FQf8ffd3Ey5AtAYDsZ2I7X4qmI+hVWNX9asd4xPf8ARn3VowPd6A4WQHhrjTO+etcIqKlAjN9sCQQWAImueJVxzPWl69eBaeS2VlpmJ6GcaJlrpCGlj8wJYLwXg2JsvjU6mJnYYF4NCY6dV+Rl+Ng/ilfGCKoe+FBCAbnMpFbX+vOw3bxp7VwnndM+0J5DGZVfO4U/Npu9eX25E8uGVawfctsbBB11p72+0pqQjsYPFBani+ioiR+Xnq7srLMZ8ta5U0lvkAJ3Ya4bc8aMDlb+GAww2zWvugMRH14npLO8SNMY1GbA6cMhtEMbXCLsLALqXotRMFIt3pr67EJ5CMEINdSn3To7lMId9y1vSRfHvtvqNktlXFs+Y3a6QSmq4/8RJE2Zd+m5MNkPGc3b5YtsXab53ohf2fxILlePq7oPqdJWsQ4/gp/z5gsvvwRzdcdDnkc8NJBYn4ZjS1hA1P3CCZHdkFtcOx/J8n+vGHygkgoXW6VvmC3r8SyKvvvN4PmCZyH8z7suLGbPToQviPIAwp94HRkb1bOINAUdxbOEtKoF5MgOLsF7lI8ClW7IJ4Ge3Dz5jCKLQVbzwHyAmcZdnYsEuVdDYjb5LgpORCWSh+xhw==*LgI2UsNQctdt3mLM4TFArg==*6kVcegsMBPGdTGK0X7YHAg==*Gm1QJtk0CUrYnX97xKY0mw=="
},
.
.
.
(Output Truncated)
}
]
}
Parameter | Description |
---|---|
article_list | A list of dictionaries containing individual news articles |
source url | web domain of the media outlet |
source name | name of the media outlet |
published_date | date of the news article |
title | title of the news article |
url_id | unique url_id of the news article. Use our separate endpoint called fetch news article by url_id to fetch complete details such as full_text, author, url etc. of the news article |
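The sample outputs show `published_date` in two different formats: an ISO 8601 style timestamp in the keyword search output and an RFC 2822 style date in the local news output. A small sketch for normalizing both to UTC follows; `parse_published_date` is a hypothetical helper, and it assumes these two formats are the only ones returned.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_published_date(value):
    """Parse either date format seen in the sample outputs into UTC."""
    try:
        # e.g. "2023-04-12T10:01:50Z" (keyword search sample)
        parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        # e.g. "Fri, 07 Apr 2023 00:49:00 GMT" (local news sample)
        parsed = parsedate_to_datetime(value)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Once normalized, articles from different endpoints can be sorted on a single timeline.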
1.4 - Get Latest News By Latitude, Longitude
This API endpoint is excellent for fetching local and regional news.
Fetch local news using geolocation coordinates (latitude, longitude) as input
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "news_by_latitude_longitude",
"lat": "40.730610",
"longitude": "-73.935242"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 96
{
"api_type": "news_by_latitude_longitude",
"lat": "40.730610",
"longitude": "-73.935242"
}
Parameter | Description |
---|---|
api_type | news_by_latitude_longitude |
lat | Latitude |
longitude | Longitude |
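Note that the two coordinate keys are named differently (`lat` and `longitude`), which is easy to get wrong. The sketch below builds the payload and checks coordinate ranges first; `build_geo_news_payload` is a hypothetical helper, and sending the values as six-decimal strings simply mirrors the sample request.

```python
def build_geo_news_payload(latitude, longitude):
    """Build the request body for the news_by_latitude_longitude endpoint."""
    lat, lon = float(latitude), float(longitude)
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return {
        "api_type": "news_by_latitude_longitude",
        # Key names follow the sample request: "lat" and "longitude".
        "lat": f"{lat:.6f}",
        "longitude": f"{lon:.6f}",
    }
```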
Output
{
"article_list": [
{
"source url": "https://www.foxnews.com",
"source name": "Fox News",
"published_date": "Thu, 06 Apr 2023 20:22:00 GMT",
"title": "NYC man pleads guilty to selling fatal dose of fentanyl-laced heroin to actor Michael K. Williams - Fox News",
"url_id": "slCdc3TihL57gNfNqq0hoOVXZy6YoUKEpq2whk9bqGcZVkhol9SRfr1pBGSG5OkysZDHwBqs7ZU57OPCzS+btyMS9kR3yqGtMXtalXoa+xYl+aoQyHwxfUbcJRzY75pvpCAdTRX5PsFHk3xbbXHixP2guu5WWRGJHWebSwuICQa56SuV95ZBUD4Xo2xvG+KfFQkNm7xSVuJMcQNUsXp0KLHrYPK0Peyso76PU0W5cldmqX7ncYOGzW5NakzWBV9xa9r7texCX+LgyGqXCru9faxXqDuibTeM3lSPoZDow8vnfk3/qibj88q+iJEhXB2ISbPEP2XttnJyrpAlSfwKIcfAtuvuLB8MPgmypVS9sXqie99sWmW8n44y7pyRCYl2rS2DwpFzQL5Fh+RwoQjeAzBfAU33udpCY+D60uEa70+4/50i/TVLgqTUmrzAqZOuNtjoEwhdwlxltBA0Wb9cT6jQfOtRweefGA==*9PU3DmkN3DblUuPpApib0g==*6b/1vdOgFWV3Qt3IwCRG5g==*mMg42UOs9Pfh1f8vjQq0Ig=="
},
.
.
.
(Output Truncated)
}
]
}
Parameter | Description |
---|---|
article_list | A list of dictionaries containing individual news articles |
source url | web domain of the media outlet |
source name | name of the media outlet |
published_date | date of the news article |
title | title of the news article |
url_id | unique url_id of the news article. Use our separate endpoint called fetch news article by url_id to fetch complete details such as full_text, author, url etc. of the news article |
1.5 - Fetch Parsed News Article By url_id
This is a simple API endpoint that will take in a url_id (generated by our other endpoints) of any news article and return parsed data containing full_text, title, author, main image.
Fetch the full_text, title, author, main image etc. from the news article using url_id as input.
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "fetch_parsed_news_article_by_id",
"url_id": "2+W9MSIWD4PJ5kF6fcRThhpsnzW92j46dBr8UEiHBKwSjaiOKBMnTWcmr5BXqe0vwpgP9bpT8r6Fgc8MnaYXYyidz9gl7WjxTCFUpKW/scAuJegSL86lw4C+M4Ea9/rPgFkTu1MkA34cs5/QXMTMH6J9xG18dcQG4sbe6a2pV1mY6jzCudZauazA13CO+NVFJH/R7UBIsT42lm2RPY4ISlPHuSh6y5AjrUiCwx4=*o3X7JdWv94Q2Td9x31pIFA==*fRxtKu8HwfTNo++zlEVHnA==*DXgO4zPEi1nWqrKZwosVwA=="
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 381
{
"api_type": "fetch_parsed_news_article_by_id",
"url_id": "2+W9MSIWD4PJ5kF6fcRThhpsnzW92j46dBr8UEiHBKwSjaiOKBMnTWcmr5BXqe0vwpgP9bpT8r6Fgc8MnaYXYyidz9gl7WjxTCFUpKW/scAuJegSL86lw4C+M4Ea9/rPgFkTu1MkA34cs5/QXMTMH6J9xG18dcQG4sbe6a2pV1mY6jzCudZauazA13CO+NVFJH/R7UBIsT42lm2RPY4ISlPHuSh6y5AjrUiCwx4=*o3X7JdWv94Q2Td9x31pIFA==*fRxtKu8HwfTNo++zlEVHnA==*DXgO4zPEi1nWqrKZwosVwA=="
}
Parameter | Description |
---|---|
api_type | fetch_parsed_news_article_by_id |
url_id | url_id of the news article. This is generated by our other endpoints. |
Output
{"author":["Perry Stein","Shayna Jacobs"],
"content":"Former president Donald Trump on Tuesday pleaded not guilty to 34 counts stemming from 2016 hush-money payments," ....(output truncated),
"meta_description":"When will he appear in court next? What is the discovery process? Your questions and more answered.",
"og_title":"What’s next for Trump after pleading not guilty to 34 felony counts",
"publishedAt":"2023-04-05",
"source_url":"washingtonpost.com",
"title":"What’s next for Trump after pleading not guilty to 34 felony counts",
"url":"https://www.washingtonpost.com/nation/2023/04/05/trump-indictment-whats-next/",
"urlToImage":"https://www.washingtonpost.com/wp-apps/imrs.php?src=https://arc-anglerfish-washpost-prod-washpost.s3.amazonaws.com/public/BSK27QJRAJAVHDMPKYPSIUYUWM.jpg&w=1440"}
Parameter | Description |
---|---|
author | A list of authors of the news article |
content | full_text of the news article |
meta_description | text from the meta_description tag on the page |
og_title | text from the og_title tag on the page |
publishedAt | date of the news article |
source_url | source domain of the news article |
title | title of the news article |
url | canonical url of the news article |
urlToImage | primary image URL from the news article |
1.6 - Fetch Parsed News Article By URL
This is a simple API endpoint that will take in a URL of any news article and return parsed data containing full_text, title, author, main image.
Fetch the full_text, title, author, main image etc. from the news article using URL as input.
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "fetch_parsed_news_article_by_url",
"url": "https://edition.cnn.com/2020/06/30/tech/facebook-ad-business-boycott/index.html"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 151
{
"api_type": "fetch_parsed_news_article_by_url",
"url": "https://edition.cnn.com/2020/06/30/tech/facebook-ad-business-boycott/index.html"
}
Parameter | Description |
---|---|
api_type | fetch_parsed_news_article_by_url |
url | URL of the news article |
Output
{
"author": [
"Rishi Iyengar"
],
"content": "San Francisco CNN Business —\n\nEach day, more household names join the list of brands suspending advertising on Facebook to protest what they say are the social network’s failures to stop the spread of hate. On Monday alone, Adidas (ADDDF), HP (HPQ), and Ford (F) added their names to a list that already had Unilever (UL), The North Face, Coca Cola (CCHGY), Honda (HMC) and many others.\n\n"........(output truncated),
"meta_description": "Each day, more household names join the list of brands suspending advertising on Facebook to protest what they say are the social network's failures to stop the spread of hate. On Monday alone, Adidas, HP, and Ford added their names to a list that already had Unilever, The North Face, Coca Cola, Honda and many others.",
"og_title": "Here's how big Facebook's ad business really is | CNN Business",
"publishedAt": "2020-06-30",
"source_url": "cnn.com",
"title": "Here’s how big Facebook’s ad business really is",
"url": "https://www.cnn.com/2020/06/30/tech/facebook-ad-business-boycott/index.html",
"urlToImage": "https://media.cnn.com/api/v1/images/stellar/prod/200630015227-facebook-ads-stock.jpg?q=x_3,y_243,h_1684,w_2993,c_crop/w_800"
}
Parameter | Description |
---|---|
author | A list of authors of the news article |
content | full_text of the news article |
meta_description | text from the meta_description tag on the page |
og_title | text from the og_title tag on the page |
publishedAt | date of the news article |
source_url | source domain of the news article |
title | title of the news article |
url | canonical url of the news article |
urlToImage | primary image URL from the news article |
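Every endpoint in this section posts to the same URL and differs only in `api_type` and its parameters, so the repeated request code can be factored into one helper. The sketch below uses the standard library `urllib` instead of `requests` purely to stay dependency-free; `build_request` and `call_api` are hypothetical names, and the 30 second timeout is an arbitrary choice.

```python
import json
import urllib.request

API_URL = "https://specrom-news-api.p.rapidapi.com/"
API_HOST = "specrom-news-api.p.rapidapi.com"


def build_request(api_type, api_key, **params):
    """Build (but do not send) a POST request for any api_type."""
    payload = {"api_type": api_type, **params}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "X-RapidAPI-Key": api_key,
            "X-RapidAPI-Host": API_HOST,
        },
        method="POST",
    )


def call_api(api_type, api_key, timeout=30, **params):
    """Send the request and return the decoded JSON body."""
    request = build_request(api_type, api_key, **params)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `call_api("fetch_parsed_news_article_by_url", api_key, url=article_url)` would send the same body as the request shown in section 1.6.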
2 - Text Analytics API Reference
The new generation of large language models (LLMs) such as GPT3, GPT4, and ChatGPT/GPT 3.5 has revolutionized the way we analyze text data.
Our base level text classification endpoint includes a few dozen topic labels; however, if you are interested in a more granular text classifier (IAB/IPTC + custom taxonomy) that contains over 1900 topics, please email us at info@specrom.com
Our comprehensive text analytics endpoints use the latest GPT-J and/or GPT3.5/GPT4 models on the back end to analyze all aspects of the news articles.
2.1 - Text Language Detection
An endpoint for detecting the language of the input text.
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "language_detection",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 192
{
"api_type": "language_detection",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
Parameter | Description |
---|---|
api_type | language_detection |
input_text | The text whose language is to be detected |
Output
{"documents":
[
{"Detected_language":
[{"ISO631-1_language_code":"en","normalized_probability":1}]
,"id":"1"}]
}
This endpoint returns a JSON object containing the following elements:
Parameter | Description |
---|---|
documents | This is a list containing a single dictionary |
ISO631-1_language_code | two letter detected language code. Take a look at all the language codes here. |
normalized_probability | Probability estimate (0-1) indicating the level of certainty in the predicted value |
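The nested response shape can be unpacked with a few lines of code. This sketch picks the most probable language from the first document; `top_language` is a hypothetical helper, and it assumes `Detected_language` may contain more than one candidate even though the sample shows only one.

```python
def top_language(result):
    """Return (language_code, probability) for the most likely language."""
    # "documents" is a list containing a single dictionary.
    candidates = result["documents"][0]["Detected_language"]
    best = max(candidates, key=lambda c: c["normalized_probability"])
    # The key name is spelled exactly as it appears in the response.
    return best["ISO631-1_language_code"], best["normalized_probability"]
```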
2.2 - Named Entity Extraction
Extract Named Entities from the input text using our API endpoint
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "named_entity_extraction",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 96
{
"api_type": "named_entity_extraction",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
Parameter | Description |
---|---|
api_type | named_entity_extraction |
input_text | input text |
Output
[
{
"$69.7 billion": "MONEY"
},
{
"2019": "DATE"
},
{
"more than 98%": "PERCENT"
},
{
"the year": "DATE"
},
{
"Coca Cola": "ORG"
},
{
"8 million": "CARDINAL"
},
{
"earlier this year": "DATE"
},
{
"100": "CARDINAL"
},
{
"$4.2 billion": "MONEY"
},
{
"last year": "DATE"
},
{
"only about 6%": "PERCENT"
},
{
"April 2019": "DATE"
},
{
"COO Sheryl Sandberg": "PERSON"
},
{
"100": "CARDINAL"
}
.
.
.
(Output Truncated)
]
Our API will extract the following entity types.
PERSON - People, including fictional.
NORP - Nationalities or religious or political groups.
FAC - Buildings, airports, highways, bridges, etc.
ORG - Companies, agencies, institutions, etc.
GPE - Countries, cities, states.
LOC - Non-GPE locations, mountain ranges, bodies of water.
PRODUCT - Objects, vehicles, foods, etc. (Not services.)
EVENT - Named hurricanes, battles, wars, sports events, etc.
WORK_OF_ART - Titles of books, songs, etc.
LAW - Named documents made into laws.
LANGUAGE - Any named language.
DATE - Absolute or relative dates or periods.
TIME - Times smaller than a day.
PERCENT - Percentage, including “%”.
MONEY - Monetary values, including unit.
QUANTITY - Measurements, as of weight or distance.
ORDINAL - “first”, “second”, etc.
CARDINAL - Numerals that do not fall under another type.
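The output is a list of single-key dictionaries mapping each entity's text to its type, which is awkward to query directly. A minimal sketch for regrouping it by entity type follows; `group_entities` is a hypothetical helper, and dropping repeated mentions (such as the second "100") is a design choice rather than API behavior.

```python
from collections import defaultdict


def group_entities(entities):
    """Group [{text: label}, ...] into {label: [text, ...]} without repeats."""
    grouped = defaultdict(list)
    for item in entities:
        for text, label in item.items():
            if text not in grouped[label]:
                grouped[label].append(text)
    return dict(grouped)
```

For instance, `group_entities(response.json()).get("ORG", [])` would list only the organizations found in the input text.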
2.3 - Text Summarization
The summary generated by this API endpoint is abstractive in nature and will be similar to what you see using ChatGPT/GPT3.5 or other LLMs. If you are looking for our older extractive summarization endpoint, let us know and we can share that with you.
This endpoint will generate an abstractive summary of the entered text. It uses a state-of-the-art LLM-based abstractive summarization model.
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "summarization",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 96
{
"api_type": "summarization",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
Parameter | Description |
---|---|
api_type | summarization |
input_text | Input Text |
Output
The recent scandal and dropouts have impacted Facebook's stock price, but advertising still accounts for more than 98% of the company's revenue, with small and medium-sized businesses making up the majority of ad dollars spent. Facebook has 8 million advertisers, and the top 100 brands only contributed about 6% of the platform's ad revenue last year. Despite the controversy, it will take more to stop Facebook's digital advertising juggernaut.
2.4 - Text Classification
All our plans include a base level text classifier taxonomy; if you need a more granular text classifier that contains over 1900 topics, please email us at info@specrom.com
Classify the input text into one of the topics of the base level text classifier taxonomy.
The topics for the base level text classifier taxonomy are:
“arts and entertainment”, “automotive”, “business”, “careers”, “education”, “family and parenting”, “food and drink”, “health and fitness”, “hobbies and interests”, “home and garden”, “illegal content”, “law and government and politics”, “non standard content”, “personal finance”, “pets”, “real estate”, “religion and spirituality”, “science”, “shopping”, “society”, “sports”, “style and fashion”, “technology and computing”, “travel”
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "topic_detection_base_classifier",
"input_text": "Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 96
{
"api_type": "topic_detection_base_classifier",
"input_text": "Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?"
}
Parameter | Description |
---|---|
api_type | topic_detection_base_classifier |
input_text | input text |
Output
{"Topic":"business"}
Parameter | Description |
---|---|
Topic | Predicted topic tag of the input text |
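As a minimal sketch of consuming this output, the snippet below parses the response body and checks the returned tag against the base taxonomy listed above. The helper name parse_topic and the choice to return None for an unrecognized tag are assumptions made for illustration; they are not part of the API.

```python
import json

# Base taxonomy topics, copied from the list in this section.
BASE_TOPICS = {
    "arts and entertainment", "automotive", "business", "careers",
    "education", "family and parenting", "food and drink",
    "health and fitness", "hobbies and interests", "home and garden",
    "illegal content", "law and government and politics",
    "non standard content", "personal finance", "pets", "real estate",
    "religion and spirituality", "science", "shopping", "society",
    "sports", "style and fashion", "technology and computing", "travel",
}


def parse_topic(response_text):
    """Return the predicted topic, or None if it is missing or unrecognized."""
    data = json.loads(response_text)
    topic = data.get("Topic")
    # Assumption: a tag outside the base taxonomy is treated as "no prediction".
    return topic if topic in BASE_TOPICS else None


sample = '{"Topic":"business"}'  # the example output shown above
print(parse_topic(sample))
```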
2.5 - Classify News Articles Using Smart Labels
News domains now publish a great deal of content besides news itself, which helps them monetize their websites through pay-per-click ads or affiliate links. The smart labels classifier helps you identify such content.
Almost all news domains, including major outlets such as the New York Times, publish many articles that are not hard news but clickbait.
Use our smart labels to identify and filter out such articles, using the headline and meta_description text as input. The labels are:
- Non-News: This category includes articles that are often referred to as “soft news.” Unlike traditional news articles that report on a specific event or breaking news, non-news articles focus on evergreen topics that are not time-sensitive. Examples of non-news articles include how-to guides, tips, reviews, and general profiles. These articles may be more feature-like in nature, and can often be enjoyed by readers at any time.
- Opinion: This category includes articles that express a strong point of view, such as editorials, opinion pieces, letters to the editor, or other content that may be subjective in nature.
- Paid News: This category includes articles that are sponsored or paid for by a brand or advertiser, often in the form of advertorials. The goal of these articles is typically to promote a product, service, or brand.
- Pop Culture: This category covers articles related to entertainment and popular culture, such as stories about celebrities, movies, TV shows, music, fashion, and other trends.
- Fact Check: This category includes articles that seek to verify the validity of rumors or questionable claims, with the goal of combating misinformation. Fact-checking articles typically provide evidence-based information and sources to support their claims.
- Roundup: This category includes articles that summarize multiple stories or provide a collection of concepts, takeaways, data analysis, or lists. Roundup articles can be useful for readers who want to quickly get up-to-speed on a particular topic or trend.
- Press Release: This category includes official statements or announcements, typically published by wire services and authored by organizations or PR professionals. Press releases may cover a variety of topics, such as new products, partnerships, or other news related to the organization.
- News: This category includes traditional news articles that report on a specific event or breaking news. These articles are typically objective in nature and report on facts related to the event or news story.
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "smart_labels",
"input_text": "Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 381
{
"api_type": "smart_labels",
"input_text": "Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?"
}
Parameter | Description |
---|---|
api_type | smart_labels |
input_text | input text (generally the headline plus the text in meta_description tag) |
Output
{"Topic":"Non-news"}
Parameter | Description |
---|---|
Topic | The predicted topic of the news article |
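The sketch below shows one way to prepare the input and use the label for filtering. It joins a headline and meta_description into input_text, as the parameter table suggests, and treats only the News label as hard news. Both helper names are hypothetical, and the case-insensitive comparison is an assumption motivated by the sample output ("Non-news") differing in case from the label list ("Non-News").

```python
import json


def build_smart_labels_payload(headline, meta_description=""):
    """Build the request payload from a headline and optional meta description."""
    parts = [p.strip() for p in (headline, meta_description) if p and p.strip()]
    return {"api_type": "smart_labels", "input_text": " ".join(parts)}


def is_hard_news(response_text):
    """Return True only when the predicted label is News."""
    label = json.loads(response_text).get("Topic", "")
    # Assumption: compare case-insensitively, since label casing may vary.
    return label.strip().lower() == "news"


payload = build_smart_labels_payload(
    "Top 20 Berkshire Hathaway holdings: What's in Warren Buffett portfolio going into 2023?"
)
print(payload["api_type"])
print(is_hard_news('{"Topic":"Non-news"}'))
```

The payload can be sent with the same requests.request call shown in the Input example above.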
2.6 - Aspect Based Sentiments Analysis
Extract topics (also known as aspects or entities) from the input text and predict the sentiment towards each of the topics.
If you are instead looking for a document-level sentiment score, see the Document Sentiments Score endpoint (section 2.7).
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "aspect_based_sentiments_analysis",
"input_text": "This is a very solid device. Wonderful job, Apple! The only thing unexpected about it was the weight... the dimensions are smaller than the old macbook air my wife had, but heavier. Screen size is the same"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 151
{
"api_type": "aspect_based_sentiments_analysis",
"input_text": "This is a very solid device. Wonderful job, Apple! The only thing unexpected about it was the weight... the dimensions are smaller than the old macbook air my wife had, but heavier. Screen size is the same"
}
Parameter | Description |
---|---|
api_type | aspect_based_sentiments_analysis |
input_text | input text |
Output
{
"Response": [
{
"Aspect": "Device",
"Sentiment": "Positive"
},
{
"Aspect": "Weight",
"Sentiment": "Negative"
},
{
"Aspect": "Dimensions",
"Sentiment": "Positive"
},
{
"Aspect": "Screen Size",
"Sentiment": "Positive"
}
]
}
Parameter | Description |
---|---|
Aspect | The extracted entities or aspects from the input text. Generally speaking, this will be some attribute, person, company, product, service etc. |
Sentiment | A label of Positive, Negative or Neutral. |
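To illustrate how the Response list might be consumed, the sketch below groups the extracted aspects by their sentiment label. The function name group_aspects is hypothetical; the field names are taken from the example output above.

```python
import json
from collections import defaultdict


def group_aspects(response_text):
    """Map each sentiment label to the list of aspects that received it."""
    data = json.loads(response_text)
    grouped = defaultdict(list)
    for item in data.get("Response", []):
        grouped[item["Sentiment"]].append(item["Aspect"])
    return dict(grouped)


# The example output shown above, compacted onto fewer lines.
sample = """
{"Response": [
  {"Aspect": "Device", "Sentiment": "Positive"},
  {"Aspect": "Weight", "Sentiment": "Negative"},
  {"Aspect": "Dimensions", "Sentiment": "Positive"},
  {"Aspect": "Screen Size", "Sentiment": "Positive"}
]}
"""
print(group_aspects(sample))
```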
2.7 - Document Sentiments Score
This endpoint takes input text and predicts an overall sentiment score for the document.
If you are instead looking for aspect-level sentiment labels, see the Aspect Based Sentiments Analysis endpoint (section 2.6).
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "overall_sentiments_score",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 151
{
"api_type": "overall_sentiments_score",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
Parameter | Description |
---|---|
api_type | overall_sentiments_score |
input_text | input text |
Output
{"documents":[{"sentiments_score":0.5673076923076923,"id":"1"}]}
Parameter | Description |
---|---|
sentiments_score | A sentiment score from 0-1 with 1 being very positive and 0 being negative. Values closer to 0.5 are neutral. |
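Because the score is a number rather than a label, callers usually need to bucket it themselves. The sketch below reads the scores from the response and maps each to a coarse label. The width of the neutral band around 0.5 (0.1 on either side) is an assumption chosen for illustration; the documentation only says that values closer to 0.5 are neutral.

```python
import json


def document_scores(response_text):
    """Return a mapping of document id to sentiments_score."""
    data = json.loads(response_text)
    return {doc["id"]: doc["sentiments_score"] for doc in data.get("documents", [])}


def score_to_label(score, neutral_band=0.1):
    """Bucket a 0-1 score into negative / neutral / positive.

    Assumption: scores within neutral_band of 0.5 count as neutral.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score > 0.5 + neutral_band:
        return "positive"
    if score < 0.5 - neutral_band:
        return "negative"
    return "neutral"


sample = '{"documents":[{"sentiments_score":0.5673076923076923,"id":"1"}]}'
for doc_id, score in document_scores(sample).items():
    print(doc_id, score_to_label(score))
```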
2.8 - Keyword or Keyphrase Extraction
This endpoint will take input text and extract the most relevant keywords and keyphrases from the input text.
Currently this model is in beta and only available for English.
Input
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "keyword_extraction",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 151
{
"api_type": "keyword_extraction",
"input_text": "While the big-name dropouts have dented Facebook's stock price and prompted leadership to address some of the concerns , it'll take a lot more to stop the company's digital advertising juggernaut. Facebook generated $69.7 billion from advertising in 2019, more than 98% of its total revenue for the year. And most of those ad dollars don't come from companies likeand Coca Cola so much as the sprawling list of small and medium-sized businesses who use Facebook to attract customers and build their brands. Facebook has 8 million advertisers, it said earlier this year. Of those, the highest-spending 100 brands accounted for $4.2 billion in Facebook advertising last year, according to data from marketing research firm Pathmatics -- or only about 6% of the platform's ad revenue. The last time Facebook shared that data itself was in April 2019 , when COO Sheryl Sandberg said the top 100 advertisers represented "
}
Parameter | Description |
---|---|
api_type | keyword_extraction |
input_text | input text |
Output
[
{"keyphrase":"digital advertising juggernaut","weight":0.01},
{"keyphrase":"dented Facebook stock","weight":0.01},
{"keyphrase":"Facebook stock price","weight":0.01},
{"keyphrase":"company digital advertising","weight":0.01},
{"keyphrase":"big-name dropouts","weight":0.03},
{"keyphrase":"dropouts have dented","weight":0.03},
{"keyphrase":"stock price","weight":0.03},
{"keyphrase":"price and prompted","weight":0.03},
{"keyphrase":"prompted leadership","weight":0.03},
{"keyphrase":"leadership to address","weight":0.03}]
Parameter | Description |
---|---|
keyphrase | Extracted keyword or keyphrase that is central to the concept/idea/person/company being discussed in the input_text |
weight | A score (0-1) that indicates the overall weight of that keyword/keyphrase |
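The sketch below ranks the returned keyphrases by weight and keeps the first few. The documentation does not state whether a higher or a lower weight means a more relevant keyphrase (the sample output lists the lower weights first), so the direction is left as an explicit argument rather than assumed. The function name top_keyphrases is hypothetical.

```python
import json


def top_keyphrases(response_text, n, higher_is_better):
    """Return the first n keyphrases after sorting by weight.

    higher_is_better selects the sort direction; ties keep the API's order.
    """
    items = json.loads(response_text)
    ranked = sorted(items, key=lambda item: item["weight"], reverse=higher_is_better)
    return [item["keyphrase"] for item in ranked[:n]]


# A subset of the example output shown above.
sample = """[
  {"keyphrase": "digital advertising juggernaut", "weight": 0.01},
  {"keyphrase": "Facebook stock price", "weight": 0.01},
  {"keyphrase": "big-name dropouts", "weight": 0.03},
  {"keyphrase": "stock price", "weight": 0.03}
]"""
print(top_keyphrases(sample, 2, higher_is_better=True))
print(top_keyphrases(sample, 2, higher_is_better=False))
```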
3 - Technology Analysis API Reference
Technology researchers and people who are interested in technology trends on a larger scale are target users for these API endpoints.
Are you looking to answer questions such as these?
Which web technologies most commonly power news websites in the Atlanta, GA area?
or
Which technologies are being used by niche news blogs that focus on cryptocurrency or bitcoin?
or
What proportion of technology news outlets are themselves running vulnerable or outdated versions of WordPress to power their own websites?
You can build your own technology database, like builtwith.com, by roughly following the workflow below:
- Using keywords and source_url as input, search for the latest news articles using either Simple News Search or Get News Headlines By Keyword Search.
- Alternatively, get geography-based news articles using Get Latest News By City, State, Country or Get Latest News By Latitude, Longitude.
- If you only have a url_id, use one of our endpoints to get the URL and full_text of the news article.
- Using the URL as input, use the Built With Analyzer endpoint below to get all the technologies that power that webpage.
3.1 - Built With Analyzer
This API endpoint is of interest to people analyzing technology trends and adoption in the news and media industry.
You can consider this endpoint akin to an unofficial builtwith.com API.
A few months ago, a major cybersecurity firm used this API to check the WordPress versions used by major media outlets, and found that a large majority were running vulnerable systems that could be hacked with known exploits.
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "builtwith_analysis",
"url": "https://indianexpress.com/article/india/kerala/kerala-rainfall-idukki-landslide-rajamala-pamba-dam-6547424/"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 192
{
"api_type": "builtwith_analysis",
"url": "https://indianexpress.com/article/india/kerala/kerala-rainfall-idukki-landslide-rajamala-pamba-dam-6547424/"
}
Parameter | Description |
---|---|
api_type | builtwith_analysis |
url | URL of the webpage to be analyzed. You can put the domain homepage here; however, since major news outlets routinely run different JavaScript plugins on different website sections, a better option is to input the individual news article URL. |
Output
{
"WordPress": {
"versions": [
"6.0.3"
],
"categories": [
"CMS",
"Blogs"
]
},
"Nginx": {
"versions": [],
"categories": [
"Web servers",
"Reverse proxies"
]
},
"PHP": {
"versions": [],
"categories": [
"Programming languages"
]
},
"Google Sign-in": {
"versions": [],
"categories": [
"Social login"
]
},
"comScore": {
"versions": [],
"categories": [
"Analytics"
]
},
"jQuery": {
"versions": [
"3.6.0",
"3.3.2",
"04072017.2"
],
"categories": [
"JavaScript libraries"
]
},
"Twitter": {
"versions": [],
"categories": [
"Widgets"
]
},
"jQuery Migrate": {
"versions": [
"3.3.2"
],
"categories": [
"JavaScript libraries"
]
},
"Automattic": {
"versions": [],
"categories": [
"PaaS"
]
},
"Slick": {
"versions": [],
"categories": [
"JavaScript libraries"
]
},
"YouTube": {
"versions": [],
"categories": [
"Video players"
]
},
"AMP": {
"versions": [],
"categories": [
"JavaScript frameworks"
]
},
"MySQL": {
"versions": [],
"categories": [
"Databases"
]
},
"Google Plus": {
"versions": [],
"categories": [
"Widgets"
]
}
}
This endpoint returns a JSON object containing the following elements:
Parameter | Description |
---|---|
versions | The detected software versions (an empty list if no version could be detected) |
categories | The detected software or technology category |
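As a rough sketch of working with this output, the snippet below pulls out the technologies for which at least one version was detected and lists the technologies in a given category. Both helper names are hypothetical; the structure (technology name mapped to versions and categories) follows the example output above.

```python
import json


def versions_detected(report):
    """Keep only the technologies with a non-empty versions list."""
    return {name: info["versions"] for name, info in report.items() if info.get("versions")}


def technologies_in_category(report, category):
    """Return the sorted names of technologies tagged with the given category."""
    return sorted(
        name for name, info in report.items() if category in info.get("categories", [])
    )


# A subset of the example output shown above.
sample = json.loads("""
{
  "WordPress": {"versions": ["6.0.3"], "categories": ["CMS", "Blogs"]},
  "Nginx": {"versions": [], "categories": ["Web servers", "Reverse proxies"]},
  "jQuery": {"versions": ["3.6.0", "3.3.2", "04072017.2"], "categories": ["JavaScript libraries"]},
  "Slick": {"versions": [], "categories": ["JavaScript libraries"]}
}
""")
print(versions_detected(sample))
print(technologies_in_category(sample, "JavaScript libraries"))
```

Collecting versions_detected results across many article URLs is one way to approach questions such as the WordPress version survey mentioned above.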
4 - Social Media & Email Addresses API Reference
This group of endpoints is of particular interest to users working in PR and outreach, content marketing, etc.
You can build your own journalist database by roughly following the workflow below:
- Using keywords and source_url as input, search for the latest news articles using either Simple News Search or Get News Headlines By Keyword Search.
- Alternatively, get geography-based news articles using Get Latest News By City, State, Country or Get Latest News By Latitude, Longitude.
- If you only have a url_id, use one of our endpoints to get the URL and full_text of the news article.
- Using the URL as input, try out the following endpoints, which can extract email addresses, names and social media handles.
4.1 - Extract All Social Media Handles & URLs
This endpoint lets you quickly extract all the major social media handles found on the webpage.
Generally speaking, this endpoint finds the author’s social media handles in addition to those of the news outlet itself.
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "extract_social_media_handles",
"url": "https://www.wsj.com/articles/a-powerful-force-teslas-momentum-leads-stock-market-surge-11595151001"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 192
{
"api_type": "extract_social_media_handles",
"url": "https://www.wsj.com/articles/a-powerful-force-teslas-momentum-leads-stock-market-surge-11595151001"
}
Parameter | Description |
---|---|
api_type | extract_social_media_handles |
url | URL of the webpage |
Output
[
"https://www.instagram.com/wsj/",
"@WSJ",
"https://www.snapchat.com/discover/Wall-Street-Journal/4806310285",
"https://www.youtube.com/user/WSJDigitalNetwork",
"https://twitter.com/WSJ",
"https://www.facebook.com/wsj"
]
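The output mixes full profile URLs with bare handles such as "@WSJ". The sketch below separates the two and groups the URLs by host. The helper names are hypothetical, and stripping a leading "www." from the host is a convenience chosen for this example.

```python
from urllib.parse import urlparse


def split_handles(items):
    """Split the response list into (profile URLs, bare handles)."""
    urls, handles = [], []
    for item in items:
        if item.startswith(("http://", "https://")):
            urls.append(item)
        else:
            handles.append(item)
    return urls, handles


def by_domain(urls):
    """Group profile URLs by host name, ignoring a leading 'www.'."""
    grouped = {}
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        grouped.setdefault(host, []).append(url)
    return grouped


# The example output shown above.
sample = [
    "https://www.instagram.com/wsj/",
    "@WSJ",
    "https://www.snapchat.com/discover/Wall-Street-Journal/4806310285",
    "https://www.youtube.com/user/WSJDigitalNetwork",
    "https://twitter.com/WSJ",
    "https://www.facebook.com/wsj",
]
urls, handles = split_handles(sample)
print(handles)
print(sorted(by_domain(urls)))
```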
4.2 - Extract Name and Spam Score For Each Email Address
We have a vast email database containing over 31 million email addresses. This API endpoint returns the full name (if available) along with spam_status and generic_email labels.
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "extract_name_and_spam_classification_from_email_address",
"email_address": "jay@jaympatel.com"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 192
{
"api_type": "extract_name_and_spam_classification_from_email_address",
"email_address": "jay@jaympatel.com"
}
Parameter | Description |
---|---|
api_type | extract_name_and_spam_classification_from_email_address |
email_address | Email address |
Output
{
"Response":
{"first_name":"Jay",
"last_name":"Patel",
"domain":"jaympatel.com",
"spam_status":"Negative",
"generic_email":"Negative"
},
"email_address":"jay@jaympatel.com"
}
This endpoint returns a JSON object containing the following elements:
Parameter | Description |
---|---|
first_name | First name (if available, otherwise NA) |
last_name | Last name (if available, otherwise NA) |
spam_status | Indicates whether the email address shows spammy behaviour; either “Negative” or “Positive”. Email addresses considered spammy have a very low sender reputation |
generic_email | “Positive” if the inbox is generic, such as info, help, noreply etc.; otherwise “Negative” |
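A minimal sketch of interpreting this response follows. It builds a display name, treating "NA" as missing per the table above, and marks an address as usable only when both labels are "Negative". The function name summarize_contact and the "usable" rule are assumptions for illustration, not part of the API.

```python
import json


def summarize_contact(response_text):
    """Reduce the response to an email, a full name (or None), and a usable flag."""
    data = json.loads(response_text)
    info = data.get("Response", {})
    # "NA" marks an unavailable name, so drop it before joining.
    names = [info.get(key) for key in ("first_name", "last_name")]
    names = [name for name in names if name and name != "NA"]
    return {
        "email": data.get("email_address"),
        "full_name": " ".join(names) or None,
        # Assumption: only non-spam, non-generic addresses are worth contacting.
        "usable": info.get("spam_status") == "Negative"
        and info.get("generic_email") == "Negative",
    }


# The example output shown above.
sample = """
{"Response": {"first_name": "Jay", "last_name": "Patel", "domain": "jaympatel.com",
              "spam_status": "Negative", "generic_email": "Negative"},
 "email_address": "jay@jaympatel.com"}
"""
print(summarize_contact(sample))
```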
4.3 - Extract Email Address By URL
Extract email addresses found in any webpage by inputting a URL. This is an excellent way to build a journalist email database.
Once you have email addresses, use the Extract Name and Spam Score For Each Email Address endpoint (section 4.2) to query our database for full names and spam labels.
import requests
url = "https://specrom-news-api.p.rapidapi.com/"
payload = {
"api_type": "email_extraction",
"url": "https://www.wsj.com/articles/a-powerful-force-teslas-momentum-leads-stock-market-surge-11595151001"
}
headers = {
"content-type": "application/json",
"X-RapidAPI-Key": "API_Key",
"X-RapidAPI-Host": "specrom-news-api.p.rapidapi.com"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
POST / HTTP/1.1
Content-Type: application/json
X-Rapidapi-Key: API_Key
X-Rapidapi-Host: specrom-news-api.p.rapidapi.com
Host: specrom-news-api.p.rapidapi.com
Content-Length: 192
{
"api_type": "email_extraction",
"url": "https://www.wsj.com/articles/a-powerful-force-teslas-momentum-leads-stock-market-surge-11595151001"
}
Parameter | Description |
---|---|
api_type | email_extraction |
url | URL of the webpage from which you want to extract email addresses |
Output
["amrith.ramkumar@wsj.com"]
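The two email endpoints can be chained, as the text above suggests. The sketch below does this with a caller-supplied call_api function, a hypothetical callable that is assumed to POST the payload as in the examples above and return the parsed JSON. The stub used here returns made-up placeholder values so the flow can be followed without network access; it does not reflect real API results.

```python
def lookup_emails_on_page(page_url, call_api):
    """Extract emails from a page, then look up each one by email address."""
    emails = call_api({"api_type": "email_extraction", "url": page_url})
    results = {}
    for email in emails:
        results[email] = call_api({
            "api_type": "extract_name_and_spam_classification_from_email_address",
            "email_address": email,
        })
    return results


def stub_call_api(payload):
    """Stand-in for a real HTTP call; all values below are placeholders."""
    if payload["api_type"] == "email_extraction":
        return ["reporter@example.com"]
    return {
        "Response": {"first_name": "NA", "last_name": "NA",
                     "spam_status": "Negative", "generic_email": "Negative"},
        "email_address": payload["email_address"],
    }


results = lookup_emails_on_page("https://example.com/some-article", stub_call_api)
print(list(results))
```

Passing the HTTP call in as an argument keeps the chaining logic separate from authentication and error handling, which the real call would need.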