List of notion.so python API libraries 2022-04
My use case is simple: I have a table page (or, in Notion terminology, a database) defined in my notion.so namespace. I have set up an integration token which can access this page, and I need to iterate over all the items in the table.
My definition of a good job is:
Provide an object-oriented API for the calls, so that I can, for example, reference the table page by its ID and then iterate over it like this:
from imaginary_notion_api import Client
notion = Client("secret_token")
changelog_db = notion.Database("DATABASE_ID")
for row in changelog_db:
    print(row.updated, row.title, row.text)
Maybe also add new values by calling .append() and other native Python operations, like square bracket access and so on. I want to avoid being bothered with low-level details, such as the fact that listing the database requires an empty query, or that the API returns some kind of JSON.
Basically, what I want is an API with a high enough level of abstraction: one that corresponds to the things you want to do on the object level and to Python's syntax constructs (context managers, the iterator protocol, and so on). I am not really interested in a thin wrapper over the API which returns JSON that you need to parse separately.
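To make the wish concrete, here is a minimal sketch of the shape such an interface could take. It does not use any existing library: `Row`, `Database`, `fetch_rows` and `add_row` are hypothetical names, and the transport is injected as two plain callables so the object layer can be shown (and tried out) without touching the network.

```python
from typing import Any, Callable, Dict, Iterable, Iterator


class Row:
    """One table row; column values are exposed as attributes."""

    def __init__(self, values: Dict[str, Any]) -> None:
        self._values = values

    def __getattr__(self, name: str) -> Any:
        # Called only when normal lookup fails. Private names are rejected
        # early so a missing _values can never cause infinite recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None


class Database:
    """Iterable, list-like view of a table (hypothetical interface)."""

    def __init__(
        self,
        fetch_rows: Callable[[], Iterable[Dict[str, Any]]],
        add_row: Callable[[Dict[str, Any]], None],
    ) -> None:
        # Both callables are assumptions: in real use they would wrap
        # the HTTP calls, here they can be anything with this shape.
        self._fetch_rows = fetch_rows
        self._add_row = add_row

    def __iter__(self) -> Iterator[Row]:
        for values in self._fetch_rows():
            yield Row(values)

    def __getitem__(self, index: int) -> Row:
        # Naive on purpose: loads every row, then indexes into the list.
        return list(self)[index]

    def append(self, **values: Any) -> None:
        self._add_row(values)


# In-memory stand-in for the remote table, just to exercise the interface.
storage = [{"updated": "2022-04-01", "title": "First", "text": "Hello"}]
changelog_db = Database(lambda: list(storage), storage.append)
changelog_db.append(updated="2022-04-02", title="Second", text="World")
for row in changelog_db:
    print(row.updated, row.title, row.text)
```

The point of injecting the callables is that the iterator and attribute access stay the same regardless of how the rows are actually fetched.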
notion-sdk
- Status: 48 commits, last update 3 months ago.
I’ve tried to use it because at first glance, it seems like it works. Except it doesn’t. It does a good job at defining all the types returned from the Notion API, but that’s pretty much it.
from notion import NotionClient
notion = NotionClient(auth="secret_token")
changelog_db = notion.databases.retrieve("DATABASE_ID")
print(changelog_db.title)
for item in changelog_db:
    print(item)
Great. I get back the Database object, which has no methods, and I can’t figure out how to iterate over the values. Looking into the code itself doesn’t help. There seem to be no methods.
Documentation is missing, and the examples/ directory is similarly useless.
Edit: Okay, so I figured it out by reading the Notion API documentation. You have to query the database with an empty query to actually get the content. Sadly, when I try it, I get a very, very long traceback full of:
Traceback (most recent call last):
  File "/home/bystrousak/Plocha/xlit/notion_blog_generator/lib/preprocessors/apitest.py", line 14, in <module>
    changelog_db = notion.databases.query("DATABASE_ID")
  File "/home/bystrousak/Plocha/xlit/notion_blog_generator/venv/lib/python3.8/site-packages/notion/endpoints/sync.py", line 94, in query
    return PaginatedList[Page].parse_obj(
  File "pydantic/main.py", line 511, in pydantic.main.BaseModel.parse_obj
  File "pydantic/main.py", line 331, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1326 validation errors for PaginatedList[Page]
results -> 0 -> properties -> Updated -> type
  unexpected value; permitted: <PropertyValueType.TITLE: 'title'> (type=value_error.const; given=date; permitted=[<PropertyValueType.TITLE: 'title'>])
results -> 0 -> properties -> Updated -> title
  field required (type=value_error.missing)
results -> 0 -> properties -> Updated -> type
  unexpected value; permitted: <PropertyValueType.RICH_TEXT: 'rich_text'> (type=value_error.const; given=date; permitted=[<PropertyValueType.RICH_TEXT: 'rich_text'>])
results -> 0 -> properties -> Updated -> rich_text
Which continues like this for several hundred lines. Typing, huh?
.next()
notion-py
- Status: 233 commits, last update 15 months ago.
This looks great; it seems to support all the features I need. The only problem is that it uses the unofficial API. This basically means that they reverse-engineered the API calls Notion itself makes and wrapped them in a custom client.
Before Notion published its API, this was the only way to do it, and I’ve used it in the past myself. But it has two serious problems:
- The internal API can and does change.
- You have to use the token from the browser, and that changes every month or so.
The second point is especially annoying. Every time Notion logs you out, you have to go to the developer console in your browser and copy the token from the cookie, which is an extra manual step you don’t want in your automation.
.next()
notion-database
- Status: 23 commits, last update 20 days ago.
From the README and the name, you could expect that this will be precisely what I am looking for, right?
The code looks straightforward:
from notion_database.database import Database
database = Database(
    integrations_token="secret_token"
)
database.retrieve_database(
    database_id="DATABASE_ID", get_properties=True
)
print(database.properties_list)
Except it prints only the names of the columns and their internal IDs:
[{'id': 'LE~%7D', 'name': 'Updated', 'type': 'date', 'date': {}}, {'id': 'YtoI', 'name': 'Title', 'type': 'rich_text', 'rich_text': {}}]
Well. Okay, so how do I iterate over the actual values? Documentation doesn’t share this secret knowledge. And I can’t see it from the debugger:
dir(database)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'create_database', 'find_all_page', 'list_databases', 'properties_list', 'query_database', 'request', 'result', 'retrieve_database', 'run_query_database', 'update_database', 'url']
Should I query the database with the IDs from the properties? I am looking into the code itself, and it doesn’t look like a promising approach:
def query_database(self):
    # Not Implemented
    pass
Am I missing something? I’ve tried different methods, and they return basically nothing of any value.
I look into the documentation, and then I see it:
from notion_database.database import Database
database = Database(
    integrations_token="secret_token"
)
database.find_all_page(database_id="DATABASE_ID")
print(database.result)
This seems to return some kind of json / dict with all the rows. Finally!
But at this point, I could construct the query myself in like ten lines of code. It appears that the library is mostly useless because all I get is JSON and not python objects. I mean, I could have done the same thing with the code straight from the documentation, and it would be only slightly longer and functionally the same:
import requests
url = "https://api.notion.com/v1/databases/DATABASE_ID/query"
payload = {"page_size": 100}
headers = {
    "Accept": "application/json",
    "Notion-Version": "2022-02-22",
    "Content-Type": "application/json",
    "Authorization": "Bearer secret_token"
}
response = requests.request("POST", url, json=payload, headers=headers)
print(response.text)
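One thing the raw call above does not handle is pagination: the query endpoint returns at most `page_size` results per request, together with `has_more` and `next_cursor` fields, and the next request has to send that cursor back as `start_cursor`. Below is a rough sketch of a generator that follows the cursor. The names `iter_database_pages`, `post_query` and `requests_transport` are made up for this illustration; `post_query` is assumed to be any callable that sends the payload and returns the decoded JSON.

```python
from typing import Any, Callable, Dict, Iterator, Optional

JsonDict = Dict[str, Any]


def iter_database_pages(
    post_query: Callable[[JsonDict], JsonDict], page_size: int = 100
) -> Iterator[JsonDict]:
    """Yield every row (page object) of a database, following the cursor."""
    cursor: Optional[str] = None
    while True:
        payload: JsonDict = {"page_size": page_size}
        if cursor is not None:
            payload["start_cursor"] = cursor
        data = post_query(payload)
        yield from data.get("results", [])
        # Stop when the API reports no more data (or gives no cursor).
        if not data.get("has_more") or not data.get("next_cursor"):
            return
        cursor = data["next_cursor"]


def requests_transport(database_id: str, token: str) -> Callable[[JsonDict], JsonDict]:
    """Hypothetical helper wrapping the same POST as the snippet above."""
    import requests  # imported lazily; the generator itself does not need it

    url = f"https://api.notion.com/v1/databases/{database_id}/query"
    headers = {
        "Accept": "application/json",
        "Notion-Version": "2022-02-22",
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }

    def post_query(payload: JsonDict) -> JsonDict:
        response = requests.post(url, json=payload, headers=headers, timeout=30)
        response.raise_for_status()
        return response.json()

    return post_query
```

With real credentials, usage would look roughly like `for page in iter_database_pages(requests_transport("DATABASE_ID", "secret_token")): ...`, which is still only raw JSON per row, but at least all of the rows.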
.next()
notion-sdk-py
- Status: 139 commits, last update 2 months ago.
This seems like an actively developed project, but it has the same problem as the previous one: it provides only a thin wrapper over the API. You can query the values using a Python API, but you’ll get JSON / dict back.
.next()
Other libraries
Uh oh. It seems like we’ve got to the end. When I search GitHub, I can see that there are several other projects, but mostly in worse shape than the ones already discussed:
- https://github.com/ngviethoang/notiondb (status: 3 commits, last 17 days ago)
  - Similar to notion-database, also a really low-level abstraction.
- https://github.com/jacksalici/notion-cli-list-manager (status: 57 commits, last 3 months ago)
  - Command-line tool, which does exactly what I want. Not a Python library.
- https://github.com/bradleyhurley/PyNotion (status: 5 commits, last 8 months ago)
  - Unfinished project, looks like it works with the unofficial API.
- https://github.com/lastorel/pytion (status: 62 commits, last 6 days ago)
  - It seems like an interesting candidate for an API library, but it is still in early development.
  - There is no package, no documentation, and everything seems like it is being worked on.
- https://github.com/visheshdvivedi/Notion-API-Python (status: 9 commits, last 11 months ago)
  - Some kind of data dumper.
- https://github.com/avlm/zotion (status: 2 commits, last 11 months ago)
  - Abandoned project.
- https://github.com/bustrama/Notion-API (status: 4 commits, last 11 months ago)
  - Another abandoned project.
- https://github.com/thomashirtz/notion-utilities (status: 17 commits, last 27 days ago)
  - Some kind of utilities which make some transformations, not really an API.
Conclusion
It seems to me that the situation is not improving. I did the same “research” approximately a year ago, concluded that there wasn’t any good library, and hoped that something would appear since then. It didn’t.
Now I have to think about what to do:
- Use the low-level API, work with JSON.
- Wait longer.
- Create something myself.
The problem is, that I don’t really want to do any of this. *Sigh*.
I mean, I can probably create something, but I don’t actually want to maintain it in the long term. That could be partially solved by using some kind of code generator, but that seems fragile, and it would break all the time. I’ve looked, and there doesn’t seem to be any kind of formal specification, like a Swagger definition or something similar.
Sure, I can use the low-level API, but this has its own issues. For this use case, it is fine, but when I make something more complicated, the result will be messy code that is hard to maintain.
Edit
Eventually, I used raw requests calls to get the job done, but it wasn’t a pleasant experience. Having to dig the data out of JSON arrays and objects nested four or five levels deep is especially annoying and unreadable.
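To give an idea of what that digging looks like, here is a small sketch that flattens one row (a page object from the query result) into a plain dict. It is not the code I actually used; `plain_value` and `flatten_page` are names invented for this example, and it only understands the `title`, `rich_text` and `date` property types, returning everything else as the raw payload.

```python
from typing import Any, Dict


def plain_value(prop: Dict[str, Any]) -> Any:
    """Reduce one property object to a plain Python value."""
    kind = prop.get("type")
    # The payload lives under a key named after the property type.
    value = prop.get(kind)
    if kind in ("title", "rich_text"):
        # A list of text fragments; join their plain_text parts.
        return "".join(part.get("plain_text", "") for part in value or [])
    if kind == "date":
        # An object with start / end, or None when the cell is empty.
        return value["start"] if value else None
    return value  # unsupported types are passed through unchanged


def flatten_page(page: Dict[str, Any]) -> Dict[str, Any]:
    """Map column name -> plain value for one row of the database."""
    return {
        name: plain_value(prop)
        for name, prop in page.get("properties", {}).items()
    }


# Shortened example of a row, using the column names from my table.
example_page = {
    "properties": {
        "Updated": {"id": "LE~%7D", "type": "date",
                    "date": {"start": "2022-04-01", "end": None}},
        "Title": {"id": "YtoI", "type": "rich_text",
                  "rich_text": [{"plain_text": "Hello "}, {"plain_text": "world"}]},
    }
}
print(flatten_page(example_page))
```

Every additional property type needs its own branch, which is exactly the kind of boilerplate a proper library should hide.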
At the moment, I am trying to create a proof-of-concept experimental library for the Notion API, but I am not sure if I’ll continue to develop it. Mostly I wanted to see how hard it would be to create it with some auto-generation of types and so on.