SMC
Version: 1.1.0
 
SMC API Help

Introduction

The SMC website has a backend API that users can use to query the data in the website. The backend API documentation shows all of the APIs and examples of how to use them. SMC uses OpenAPI to document the APIs: https://smc.jgi.doe.gov/docs

Under each section are the APIs available for use. Clicking "GET", "POST", "DELETE" or "PUT" shows an example of how to use that API. Use the "Try it out" button to test the API.
  • GET APIs return data from the SMC database.
  • POST APIs update data in the SMC database.
  • DELETE APIs remove data from the SMC database.
  • PUT APIs insert new data into the SMC database.
Some APIs require your user token as part of the request. Use the bash "curl" command to call the API from the terminal, or use the Python "requests" library to call it from Python.

Examples

Example sections show either BASH terminal sessions (black background) or Python code (green background). Use the Contact Us form if the examples do not work for you.

BASH terminal example:

  # Comments start with "#" and do not need to be run in the terminal.

  # This is an example to run in your bash shell terminal.
  # Commands start with "$" and the output is shown after the command, e.g. for the date command:
  $ date
  Mon Apr  1 11:11:14 PDT 2024

  # use the curl command to do a simple call to the SMC API & return JSON
  $ curl https://smc.jgi.doe.gov/v2/msg/hello
  {"msg_time":"2024-04-03 11:35:56","msg":"hello","msg_ip":"172.71.154.171"}
            


Python code example:

  #!/usr/bin/env python
  import math

  # comment - calculate the area of a circle
  def area_of_circle(r):
      return math.pi * r * r
            


API Data Format

APIs return their data in JSON format, which is easy to read from Python. The JSON data structure contains:
  • "data": the data returned by the API, either a JSON object or a list
  • "error": messages about errors, "null" if there were no errors
  • "has_next": "true" if more data is available, otherwise "false". When "true", use the paging token to get the next set of results. (optional)
  • "total_count": total number of results in the list (optional)
  • "token": an encrypted string containing information to get the next set of results (optional)
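
As a minimal sketch of how a client might consume this structure, the hypothetical helper below (unpack_response is not part of SMC) raises when "error" is set and otherwise returns the data together with the optional paging fields. The sample payload is invented for illustration and only uses the field names listed above.

```python
import json


def unpack_response(payload):
    """Split a decoded response into (data, paging info).

    Assumes the fields listed above; the optional ones may be absent.
    """
    error = payload.get("error")
    if error:  # "error" is null (None in Python) when the call succeeded
        raise RuntimeError(f"API error: {error}")
    paging = {
        "has_next": bool(payload.get("has_next", False)),
        "total_count": payload.get("total_count"),
        "token": payload.get("token"),
    }
    return payload.get("data"), paging


# Illustrative payload only, not real SMC output
sample_text = (
    '{"data": [{"id": 1}, {"id": 2}], "error": null, '
    '"has_next": true, "total_count": 7, "token": "abc"}'
)
data, paging = unpack_response(json.loads(sample_text))
print(len(data), paging["has_next"], paging["total_count"])
```

Individual endpoints may not return every field, so the helper reads the optional ones with .get() rather than indexing them directly.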

Example using the "GET" /v2/source-data/{source_data_id} API in a terminal session:

# get source data from the source_data API using id 58411
$ curl https://smc.jgi.doe.gov/v2/source-data/58411
{"source_data":{"source_id":3210,"size_bp":5892123,"gc_count":35.18,"source_data_type_id":1,"dna_sequence_file_id":22695,"gff_file_id":22696,"taxonomy_id":718224,"accession_id":"2537562224", ...}}
            

The JSON returned is not very human readable. You can make it more readable with Python's json module, "$ curl (command) | python -m json.tool", or with jq, "$ curl (command) | jq ."
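
The same formatting can be done inside a Python script with json.dumps. This is a small sketch; the input string is a subset of the fields shown in the curl output above, because the rest of that output is elided.

```python
import json

# Subset of the fields from the curl output above (the remainder is elided there)
raw = '{"source_data":{"source_id":3210,"size_bp":5892123,"gc_count":35.18}}'

parsed = json.loads(raw)
# indent=2 puts each key on its own line; sort_keys makes the order stable
pretty = json.dumps(parsed, indent=2, sort_keys=True)
print(pretty)
```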

User Tokens

When you sign in to the SMC website with the "LOGIN" link at the top left of the screen, SMC creates a user token for you. The user token expires after 12 hours. After it expires, a new one is created when you visit the SMC website again. The user token is needed for some protected APIs. The user token in the examples (HHGTTG42) is only a placeholder; your actual token will look similar but is normally longer.

The user token can be found under the user dropdown when you hover over your name in the top left corner. Select the "copy token to clipboard" option.

The following example shows how to use the user token with the "/v2/users/me" API:

# set the SMC_USER_TOKEN variable to store the user token
$ export SMC_USER_TOKEN="HHGTTG42"
$ curl -X 'GET' \
"https://smc.jgi.doe.gov/v2/users/me" \
  -H 'accept: application/json' \
  -H "Authorization: Bearer $SMC_USER_TOKEN" | jq .
  {
    "dt_join": "Feb 03, 2022 02:22:36 pm",
    "dt_login": "Apr 03, 2024 11:44:20 am",
    "first_name": "Donnie",
    "last_name": "Baker",
    "email": "[email protected]",
    "role_name": "regular",
    "token_exp_sec": 21868.049319
  }
            


Python code example for the same request:

#!/usr/bin/env python

import requests
import json

# setup variables to use
smc_url = "https://smc.jgi.doe.gov" # SMC website
# smc user token from SMC website, need to refresh every 12 hours
smc_user_token = "HHGTTG42" # normally longer
# setup smc_header to send the smc_user_token
smc_header = {"Authorization": f"Bearer {smc_user_token}", "Content-Type": "application/json"}

# wait up to 30 seconds to get a response from the SMC backend API
timeout_sec = 30

# put together the API call
smc_api = f"{smc_url}/v2/users/me"
print(f"GET {smc_api}")

# use the requests library to call the API
r = requests.get(smc_api, headers=smc_header, timeout=timeout_sec)

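# use the json library to load the response into the smc_data json variable
smc_data = json.loads(r.text)

# the user fields may be nested under "data" or returned at the top level
# (the curl output above shows them at the top level), so handle both
user = smc_data.get("data") or smc_data

# print results
print("User email: {}, name: {} {}".format(user.get("email"), user.get("first_name"), user.get("last_name")))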
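
The "token_exp_sec" field in the sample response looks like the time left before the user token expires. The sketch below formats it as hours and minutes so a script can warn before the token runs out. That reading of the field is an assumption based on its name and the 12-hour lifetime, and describe_token_expiry is a hypothetical helper.

```python
def describe_token_expiry(token_exp_sec):
    """Format the remaining token lifetime as hours and minutes.

    Assumption: token_exp_sec is the number of seconds until the token expires.
    """
    remaining = max(0, int(token_exp_sec))  # never report a negative lifetime
    hours, rest = divmod(remaining, 3600)
    minutes = rest // 60
    return f"{hours}h {minutes:02d}m"


# value taken from the sample /v2/users/me response
print(describe_token_expiry(21868.049319))  # prints "6h 04m"
```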
          


Search Example

The search APIs take the search parameters, including the search string, as JSON in the POST body.
The following curl call searches for "Escherichia Coli" using the source-name search:

$ curl https://smc.jgi.doe.gov/v2/search/source-name \
-d '{"search": "Escherichia Coli", "page_size": 5}' \
-H 'Content-Type: application/json'  | jq .
{
  "total_count": 1000,
  "search_results": [
    {
      "smc_id": 5713,
      "tax_name": "Escherichia coli BIDMC 20B",
      "bgc_count": 13
    },
    {
      "smc_id": 5636,
      "tax_name": "Escherichia coli BIDMC 20A",
      "bgc_count": 13
    },
    {
      "smc_id": 5618,
      "tax_name": "Escherichia coli O104:H4 str. Ec11-9450",
      "bgc_count": 14
    },
    {
      "smc_id": 5617,
      "tax_name": "Escherichia coli O104:H4 str. Ec11-9941",
      "bgc_count": 14
    },
    {
      "smc_id": 5514,
      "tax_name": "Escherichia coli BIDMC 38",
      "bgc_count": 13
    }
  ],
  "token": "eyJzZWFyY2giOiAiRXNjaGVyaWNoaWEgQ29saSIsICJwYWdlX3NpemUiOiA1LCAic29ydCI6ICJkZXNjIiwgInNvcnRfZmllbGQiOiAic21jX2lkIiwgImxhc3RfaWQiOiA1NTE0fQ=="
}
            

The results show the first 5 results from the search query. The "token" is used for paging through the results, and each page returns a different "token" value. If there are no more results, the "token" is not included in the JSON results. Only the token field is needed to continue paging through the search results from the example above. The "total_count" is the number of records remaining, counted from the start of the current page, so it drops from 1000 to 995 on the next page.

$ curl https://smc.jgi.doe.gov/v2/search/source-name \
-d '{"token": "eyJzZWFyY2giOiAiRXNjaGVyaWNoaWEgQ29saSIsICJwYWdlX3NpemUiOiA1LCAic29ydCI6ICJkZXNjIiwgInNvcnRfZmllbGQiOiAic21jX2lkIiwgImxhc3RfaWQiOiA1NTE0fQ=="}' \
-H 'Content-Type: application/json'  | jq .
{
  "total_count": 995,
  "search_results": [
    {
      "smc_id": 5449,
      "tax_name": "Escherichia coli EC096/10",
      "bgc_count": 9
    },
    {
      "smc_id": 5426,
      "tax_name": "Escherichia coli UMEA 3489-1",
      "bgc_count": 11
    },
    {
      "smc_id": 5425,
      "tax_name": "Escherichia coli UMEA 3426-1",
      "bgc_count": 11
    },
    {
      "smc_id": 5424,
      "tax_name": "Escherichia coli UMEA 3693-1",
      "bgc_count": 15
    },
    {
      "smc_id": 5423,
      "tax_name": "Escherichia coli UMEA 3342-1",
      "bgc_count": 17
    }
  ],
  "token": "eyJzZWFyY2giOiAiRXNjaGVyaWNoaWEgQ29saSIsICJwYWdlX3NpemUiOiA1LCAic29ydCI6ICJkZXNjIiwgInNvcnRfZmllbGQiOiAic21jX2lkIiwgImxhc3RfaWQiOiA1NDIzfQ=="
}
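
One way to see why the token alone is enough for later pages: the first sample token above appears to be base64-encoded JSON that carries the search parameters and the last ID returned. The sketch below decodes it for inspection. This is an observation about that one sample, not documented behavior, so treat tokens as opaque and send them back unchanged.

```python
import base64
import json

# First paging token from the search example above
token = (
    "eyJzZWFyY2giOiAiRXNjaGVyaWNoaWEgQ29saSIsICJwYWdlX3NpemUiOiA1LCAic29ydCI6"
    "ICJkZXNjIiwgInNvcnRfZmllbGQiOiAic21jX2lkIiwgImxhc3RfaWQiOiA1NTE0fQ=="
)

# Assumption: the token is base64-encoded JSON, as this sample appears to be
decoded = json.loads(base64.b64decode(token))
print(decoded)
```

In this sample the decoded "last_id" matches the last "smc_id" on the first page (5514), which suggests the next page starts from that ID.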
            

Paging Example with Python

The same search for "Escherichia Coli" using the /v2/search/source-name API. It returns the first 3 pages of 5 results each, sorted by ascending SMC ID.

#!/usr/bin/env python

import requests
import json

# setup variables to use
search = "Escherichia Coli" # our search string
page_size = 5 # number of results to return from a request
smc_url = "https://smc.jgi.doe.gov" # SMC Website
smc_header = {"Content-Type": "application/json"} # header for getting the correct content-type
timeout_sec = 30 # wait up to 30 seconds to get a response from the SMC API
smc_api = f"{smc_url}/v2/search/source-name" # API to use

# create json data with search parameters
search_json = {
    "search": search,
    "page_size": page_size,
    "sort": "asc" # sort by SMC-ID ascending instead of the default descending
}

pagination_token = "" # returned from the API call

done = False # have we gotten all of the results?
request_cnt = 0 # number of requests to the API
page_total = 0 # number of results returned

while not done:
    request_cnt = request_cnt + 1

    if pagination_token:
        search_json = { "token": pagination_token }

    # use the requests library to call the API
    print(f"POST {smc_api}")
    r = requests.post(smc_api, json=search_json, headers=smc_header, timeout=timeout_sec)

    # use the json library to load the response into the smc_data json variable
    smc_data = json.loads(r.text)

    # print results
    results = smc_data.get("search_results", [])
    page_total = page_total + len(results)  # count what was actually returned
    print("Results: {page_total} / {total}".format(page_total=page_total, total=smc_data.get("total_count")))
    for source_data in results:
        print("SMC ID: {}, Tax Name: {}, BGC Count: {}".format(source_data.get("smc_id"), source_data.get("tax_name"), source_data.get("bgc_count")))

    # get the token
    pagination_token = smc_data.get("token")

    print()

    if not pagination_token:
        done = True

    # just get the first 3 pages of results
    if request_cnt >= 3:
        done = True
            
Example output:
POST https://smc.jgi.doe.gov/v2/search/source-name
Results: 5 / 1000
SMC ID: 2, Tax Name: Escherichia coli str. K-12 substr. MG1655, BGC Count: 8
SMC ID: 65, Tax Name: Escherichia coli CFT073, BGC Count: 12
SMC ID: 112, Tax Name: Escherichia coli O157:H7 str. Sakai, BGC Count: 12
SMC ID: 140, Tax Name: Escherichia coli BL21(DE3), BGC Count: 10
SMC ID: 171, Tax Name: Escherichia coli str. K-12 substr. W3110, BGC Count: 8

POST https://smc.jgi.doe.gov/v2/search/source-name
Results: 10 / 995
SMC ID: 177, Tax Name: Escherichia coli SE11, BGC Count: 10
SMC ID: 181, Tax Name: Escherichia coli SE15, BGC Count: 10
SMC ID: 192, Tax Name: Escherichia coli O103:H2 str. 12009, BGC Count: 11
SMC ID: 193, Tax Name: Escherichia coli O111:H- str. 11128, BGC Count: 14
SMC ID: 287, Tax Name: Escherichia coli UTI89, BGC Count: 14

POST https://smc.jgi.doe.gov/v2/search/source-name
Results: 15 / 990
SMC ID: 289, Tax Name: Escherichia coli 536, BGC Count: 12
SMC ID: 348, Tax Name: Escherichia coli APEC O1, BGC Count: 11
SMC ID: 443, Tax Name: Escherichia coli O139:H28 str. E24377A, BGC Count: 10
SMC ID: 444, Tax Name: Escherichia coli HS, BGC Count: 10
SMC ID: 452, Tax Name: Escherichia coli B str. REL606, BGC Count: 11
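
The paging loop can also be written as a generator, which separates the token handling from the HTTP call. The sketch below is one possible arrangement, not the SMC team's implementation: iter_pages and fake_post are hypothetical names, and fake_post stands in for the real API with made-up data so the example runs offline.

```python
def iter_pages(post, search, page_size=5, max_pages=3):
    """Yield one list of results per page, following the paging token.

    `post` is any callable that takes the JSON body (a dict) and returns the
    decoded JSON response (a dict).
    """
    body = {"search": search, "page_size": page_size}
    for _ in range(max_pages):
        response = post(body)
        yield response.get("search_results", [])
        token = response.get("token")
        if not token:  # no token means there are no more pages
            return
        body = {"token": token}  # later pages need only the token


# Stand-in for the real API (hypothetical data, two pages)
def fake_post(body):
    pages = {
        None: {"search_results": [{"smc_id": 1}, {"smc_id": 2}], "token": "t1"},
        "t1": {"search_results": [{"smc_id": 3}]},  # last page: no token
    }
    return pages[body.get("token")]


for page in iter_pages(fake_post, "Escherichia Coli", page_size=2):
    print([row["smc_id"] for row in page])
```

To use it against the real API, pass a function that wraps the requests.post call from the example above and returns the decoded JSON response.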
            
For APIs that need a user token, add the "Authorization" key to the smc_header dictionary.
smc_header = {"Authorization": f"Bearer {smc_user_token}", "Content-Type": "application/json"}
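
A small sketch of building the header either way, so the same script works for open and protected APIs. build_headers is a hypothetical helper, and reading the token from the SMC_USER_TOKEN environment variable is an assumption that reuses the variable name from the curl example.

```python
import os


def build_headers(token=None):
    """Return request headers, adding Authorization only when a token is given."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


# SMC_USER_TOKEN may be unset, in which case no Authorization header is sent
smc_header = build_headers(os.environ.get("SMC_USER_TOKEN"))
print(sorted(smc_header))
```

Keeping the token in an environment variable also avoids hard-coding it in scripts that might be shared.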


Please use the Contact Us option on the website for questions or clarification.