How To Query MongoDB Documents In Python
Introduction
When you need to find information in a MongoDB collection, querying the right way matters. You want the right data as quickly as possible so that you can make sound decisions. There are several ways to find documents in MongoDB, and knowing which one to use saves time. For example, you might use a multiple-condition query to find documents with PyMongo, or sort the returned data. This tutorial covers these techniques and more as it shows you how to query MongoDB documents in Python.
Prerequisites
- MongoDB – Verify that it is installed and running. To do this, open a terminal window and run the command `mongo --version`. Alternatively, type `mongo` in the terminal window, then press the Return key.
mongo --version
- Python 3 – Confirm that it is installed and working. Note: Python 2 has reached end of life, so download and install Python 3 instead.
- MongoDB Python driver – Install it with the package manager `pip3`.
pip3 install pymongo
Create a Python script directory
Get the environment ready for using the MongoDB server with Python. Make a directory for the script and its related files.
mkdir python-mongo
NOTE: For this tutorial, we’ll use an example project called `python-mongo`.
Make MongoDB class instances after importing the PyMongo library
- Import the `MongoClient` class from the PyMongo library.
- Create a new client instance.
from pymongo import MongoClient

# A MongoDB client instance for Python
mongo_client = MongoClient('mongodb://localhost:27017')
Make a PyMongo MongoDB database instance
- Use the client instance to access a database, and from there a collection to query.
# A database instance
db = mongo_client.some_database
Make a PyMongo MongoDB collection instance
- Make a collection instance from the database so that its documents are ready to query.
# A collection instance
col = db.some_collection
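The attribute-style access shown above has a bracket-style equivalent, which the complete script at the end of this tutorial uses (`mongo_client['employees']`). Bracket access is handy when a name is held in a variable or is not a valid Python identifier. The sketch below is only an illustration: `get_collection` is a hypothetical helper, and nested dictionaries stand in for the client so that it runs without a MongoDB server.

```python
def get_collection(client, db_name, col_name):
    """Return a collection using bracket (dictionary-style) access."""
    return client[db_name][col_name]


# Nested dictionaries stand in for a MongoClient here, so the sketch runs
# without a server; with PyMongo you would pass mongo_client instead.
stand_in_client = {"employees": {"new_hires": "<collection placeholder>"}}
print(get_collection(stand_in_client, "employees", "new_hires"))
```

With a real client, the call would look like `col = get_collection(mongo_client, "employees", "new_hires")`.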
A basic example of a MongoDB collection PyMongo query
- The `find()` method takes a Python dictionary as its filter. The example below passes a dictionary to `find()` to query the documents in a MongoDB collection.
result = col.find( {"some field": "FIND ME!"} )
- `find()` returns a `pymongo.cursor.Cursor` object, stored here in `result`, which yields the matching documents as you iterate over it.
Use the regular expression “$regex” to locate documents with a partial string match
- To query documents by partial string match, build a nested dictionary. The outer dictionary’s key is the field you’re querying, and the inner dictionary’s key is `"$regex"`, with the pattern to match as its value.
# A query dictionary object that uses $regex
regex_query = { "field example" : {"$regex" : "PARTIAL STRING MATCH"} }
- Next, pass the nested dictionary to the `find()` method.
result = col.find( regex_query )
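Because the value of `"$regex"` is a regular expression, characters such as `.` or `+` in the search text are treated as pattern syntax. The following is a minimal sketch, assuming you want a literal substring match: `build_regex_query` is a hypothetical helper (not part of PyMongo) that escapes the text with Python's `re.escape()` and optionally adds MongoDB's `"$options": "i"` for case-insensitive matching.

```python
import re


def build_regex_query(field, text, ignore_case=False):
    """Build a $regex filter that matches `text` literally anywhere in `field`."""
    condition = {"$regex": re.escape(text)}
    if ignore_case:
        # "i" is MongoDB's option for case-insensitive matching
        condition["$options"] = "i"
    return {field: condition}


regex_query = build_regex_query("field example", "a.b", ignore_case=True)
print(regex_query)
```

The returned dictionary is passed to `find()` exactly like the hand-written `regex_query` above.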
Use a Python iterator to print each document returned by MongoDB
- You can retrieve every document by iterating over the `result` object as if it were a Python list.
for doc in result:
    print (doc)
- Each document returned prints as a dictionary like this:
{'_id': ObjectId('5ced203bd3c4454072c57040'), 'field 1': 'value', 'field 2': 'value'}
Get the values and fields of MongoDB documents
You can read the fields and values of MongoDB documents directly. Each document that the iterator returns from the `pymongo.cursor.Cursor` object is a Python dictionary, so its `_id` key is accessed like any other key.
How to access the “_id” field of a MongoDB Python document
- Read the `"_id"` key of each iterated document to obtain the document’s `_id`.
# iterate the returned Cursor object
for doc in result:
    # print the document's _id to terminal
    print ("doc _id:", doc["_id"])
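Every document has an `_id`, but documents in the same collection do not have to share their other fields, so indexing with `doc["some field"]` can raise a `KeyError`. A small sketch of one way to handle this with the dictionary `get()` method follows; `summarize_docs` and the sample documents are hypothetical, and a plain list stands in for the Cursor object.

```python
def summarize_docs(docs, field):
    """Return (_id, value) pairs; value is None when a document lacks `field`."""
    return [(doc["_id"], doc.get(field)) for doc in docs]


sample_docs = [
    {"_id": 1, "field 1": "value"},
    {"_id": 2},  # this document has no "field 1"
]
print(summarize_docs(sample_docs, "field 1"))  # [(1, 'value'), (2, None)]
```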
Obtain a complete list of the result’s attributes with the __dict__ attribute
- The `__dict__` attribute shows the instance attributes of the Cursor object that the API call returned. To list its methods as well, use Python’s built-in `dir()` function.
# the __dict__ attribute shows the instance attributes of the Cursor object
print ("Cursor attr:", result.__dict__)
Pass the collection’s find() method to the Python list() function to return a list of MongoDB documents
Pass the entire `collection_object.find()` API call into the `list()` function to get a list containing every document in the collection that matched the query, along with its data.
The example below iterates over the document dictionaries returned in the list and prints each document’s `_id`:
# build a Python dictionary for query
query = {"search this field" : "find this value"}

documents = list(col.find(query))

for doc in documents:
    print ("\ndoc _id:", doc["_id"])
Obtain the number of documents returned by the MongoDB API query
- Keep in mind that the Cursor object’s `count()` method was deprecated in PyMongo 3.7 and removed in PyMongo 4.0, so calling it with a current driver raises an error. Older driver versions used the `count()` method to get the number of documents after `find()` returned a Cursor object.
# a query request result
result = col.find(some_query)

# the count() method (deprecated; works only with older PyMongo versions)
print ("number of docs:", result.count())
- With those older versions, the `count()` method returned an integer for the number of documents matched by the API call.
This example shows how the old count() method raises a DeprecationWarning
The Cursor object’s count() method is deprecated since v3.7 of PyMongo
- There are two ways to get the document count successfully. Either call the collection object’s `count_documents()` method, or count the documents while iterating over the result object with Python’s built-in `enumerate()` function.
How to use the count_documents() method
- Pass the query dictionary to the `count_documents()` method of the collection instance.
doc_count = col.count_documents(some_query)
print ("doc_count:", doc_count)
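`count_documents()` always needs a filter argument, and an empty dictionary `{}` matches every document. The sketch below wraps that in a hypothetical `count_docs` helper. So that it runs without a MongoDB server, it uses `FakeCollection`, a made-up stand-in that only understands equality filters; with PyMongo you would pass a real collection instance instead.

```python
class FakeCollection:
    """Hypothetical stand-in for a PyMongo collection (equality filters only)."""

    def __init__(self, docs):
        self.docs = docs

    def count_documents(self, query):
        # Count the documents whose fields equal every value in the filter.
        return sum(
            all(doc.get(key) == value for key, value in query.items())
            for doc in self.docs
        )


def count_docs(col, query=None):
    """Count matching documents; the empty filter {} matches every document."""
    return col.count_documents({} if query is None else query)


col = FakeCollection([{"sex": "male"}, {"sex": "female"}, {"sex": "male"}])
print(count_docs(col, {"sex": "male"}))  # 2
print(count_docs(col))                   # 3
```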
How to iterate and count documents
- You can do this in two ways: keep your own counter while you iterate over the returned `result` object, or let `enumerate()` number the documents for you.
num = 0  # stays 0 if the query matched nothing
for num, doc in enumerate(result, start=1):
    print ("num:", num, "-- _id:", doc["_id"])
print ("total documents:", num)
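One thing to watch when counting by iteration: a cursor, like other Python iterators, is consumed as you loop over it, so a second loop over the same object finds nothing left unless the cursor is reset. A minimal sketch of keeping the documents while counting them is shown here; `collect_and_count` is a hypothetical helper, and a plain Python iterator stands in for the Cursor object.

```python
def collect_and_count(cursor):
    """Read every document once, returning the documents and their count."""
    docs = list(cursor)
    return docs, len(docs)


# A plain iterator stands in for the Cursor returned by find()
cursor_like = iter([{"_id": 1}, {"_id": 2}, {"_id": 3}])

docs, total = collect_and_count(cursor_like)
print("total documents:", total)             # total documents: 3
print("left in iterator:", list(cursor_like))  # left in iterator: []
```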
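Use the datetime library in Python to make PyMongo range queries
- Values created with Python’s `datetime` library can be used in `find()` queries, including range queries.
How datetime objects in PyMongo are utilized
- This tutorial assumes that the dates in the collection are stored as strings, so each datetime object is converted to a string before the query is passed to the MongoDB server. If a field holds BSON dates instead, PyMongo accepts a `datetime.datetime` object in the query directly.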
When to import the datetime library
- At the start of the script, import the `datetime` library (the complete script at the end of this tutorial also uses `json`):
import json, datetime
NOTE: Python raises its built-in `ValueError` exception if you pass an out-of-range month or day value to `datetime.datetime()`. Don’t pass a month integer over `12` or a day integer over `31` (or over the number of days in the given month). If you see this error, you’ll know what caused it and how to fix it.
Pass a Python datetime string to MongoDB
- Make a datetime object first with the `datetime.datetime()` constructor.
- Then convert it to a string.
- Next, pass it to MongoDB.
Create a new datetime string in Python to pass to the MongoDB request
Use the `datetime.datetime()` constructor to create a datetime object for the query request to PyMongo’s `find()` method. Convert the datetime object to a string explicitly before adding it to the query dictionary:
# use the parsed HTTP data and create a new datetime object
start_date = datetime.datetime(query_year, query_month, query_day)

# convert it explicitly to a datestring
start_date = str(start_date)
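Since `datetime.datetime()` raises `ValueError` for out-of-range values, date parts that come from user input can be checked before the query is built. Here is a minimal sketch of that idea; `parse_query_date` is a hypothetical helper, and returning `None` for bad input is just one possible convention.

```python
import datetime


def parse_query_date(year, month, day):
    """Return the date as a string, or None if the values are out of range."""
    try:
        return str(datetime.datetime(year, month, day))
    except ValueError:
        return None


print(parse_query_date(2015, 4, 12))  # 2015-04-12 00:00:00
print(parse_query_date(2015, 13, 1))  # None
```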
About the $gte and $gt MongoDB query selectors
- The MongoDB query selector for greater than or equal to is `$gte`, and `$gt` is the selector for greater than. The query selector is the key of the inner dictionary in a nested Python dictionary.
query = { "join_date": {"$gt": start_date} }
- The example below passes the query dictionary directly into the `find()` method:
# call the find() method to make a date range request
result = col.find({"join_date": {"$gt": start_date}}).sort("name")
- The `sort()` method sorts the returned documents by a particular field. In the example above, it’s the `"name"` field.
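The query above has only a lower bound. To bound the range on both sides, the inner dictionary can hold two selectors, for example `$gte` together with `$lt` (less than). The sketch below assumes, as this tutorial does, that `join_date` is stored as a string in the format produced by `str(datetime.datetime(...))`; `build_date_range_query` is a hypothetical helper.

```python
import datetime


def build_date_range_query(field, start, end):
    """Build a filter matching values where start <= value < end."""
    return {field: {"$gte": start, "$lt": end}}


# Date strings covering April 2015
start = str(datetime.datetime(2015, 4, 1))
end = str(datetime.datetime(2015, 5, 1))

query = build_date_range_query("join_date", start, end)
print(query)
```

String comparison gives chronological order here only because the format is fixed-width and zero-padded (year, then month, then day).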
How to use the PyMongo MongoDB query operators
The following rules apply when you use the `$and`, `$or`, and `$not` MongoDB query operators.
- For `$and` and `$or`, the outer dictionary key must be the query operator itself.
- The conditions must be dictionaries in a Python list, and that list must be the value of the operator key.
- `$not` works differently: it is not used as an outer key with a list. It goes inside a field’s dictionary and takes a single operator expression as its value.
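In MongoDB's query language, `$not` wraps an operator expression for a single field rather than taking a list of conditions, while `$nor` is the list-based operator for "none of these conditions". The sketch below shows the shape of both filters; `build_not_query` and `build_nor_query` are hypothetical helpers, and the field names are taken from the example script at the end of this tutorial.

```python
def build_not_query(field, value):
    """Match documents whose `field` is not equal to `value`."""
    return {field: {"$not": {"$eq": value}}}


def build_nor_query(*conditions):
    """Match documents that satisfy none of the given conditions."""
    return {"$nor": list(conditions)}


print(build_not_query("age", "22"))
print(build_nor_query({"sex": "female"}, {"age": "22"}))
```

Note that a `$not` filter also matches documents that do not contain the field at all.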
Multiple conditions queries and PyMongo requests
- A multiple-condition query with the `$and` query operator can be written on several lines or on a single line.
- Below is a multiple-line, multiple-condition query with the `$and` operator:
query = {
    "$and": [
        { "field 1": "MUST MATCH THIS" },
        { "field 2": "..AND THIS!" }
    ]
}
- Below is the same multiple-condition query on just one line:
query = {'$and': [{'field 1': 'MUST MATCH THIS'}, {'field 2': '..AND THIS!'}]}
Multiple condition query and the find() method
- No special steps are needed here. This type of PyMongo query is passed to `find()` like the others. See below:
result = col.find( query ).sort("field 1")
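To see how `$and` and `$or` differ, it can help to evaluate a filter against a single document by hand. The following is purely an illustration that runs locally in Python: `matches` is a hypothetical function covering only equality, `$and`, and `$or`, and it is not how the MongoDB server evaluates queries.

```python
def matches(doc, query):
    """Evaluate a small subset of MongoDB filters ($and, $or, equality) locally."""
    for key, value in query.items():
        if key == "$and":
            # every condition in the list must match
            if not all(matches(doc, cond) for cond in value):
                return False
        elif key == "$or":
            # at least one condition in the list must match
            if not any(matches(doc, cond) for cond in value):
                return False
        elif doc.get(key) != value:
            return False
    return True


employee = {"sex": "male", "age": "30"}
and_query = {"$and": [{"sex": "male"}, {"age": "26"}]}
or_query = {"$or": [{"sex": "male"}, {"age": "26"}]}

print(matches(employee, and_query))  # False
print(matches(employee, or_query))   # True
```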
Image example of Python IDLE environment making a PyMongo multiple-condition query request using the find() method and $or query operator
Conclusion
This tutorial explained how to query MongoDB documents in Python. You learned how to use the `find()` method to make a query request on a MongoDB collection. You also learned about the `$gt` (greater than) and `$gte` (greater than or equal to) operators for locating documents with the `find()` method, and how to import the `datetime` library to build date queries. We went over multiple-condition queries and sorting the returned results as well. These techniques should help you in your current and upcoming MongoDB projects.
For further reference, turn to the examples shown below for querying MongoDB documents in a Python script.
#!/usr/bin/env python3
#-*- coding: utf-8 -*-

# import the MongoClient class
from pymongo import MongoClient

# build a new client instance for MongoDB
mongo_client = MongoClient('localhost', 27017)

# get the employees database
db = mongo_client['employees']

# get the newly hired people
col = db['new_hires']

"""
LOGICAL OPERATORS FOR MULTIPLE QUERY CONDITIONS
"""
# find any document of an employee who is a male AND is 26
multiple_param = { "$and": [ {"sex": "male"}, {"age": "26"}]}

# find any document of an employee who is a female and is NOT 22
# ($not wraps an operator expression inside the field's dictionary)
multiple_param = { "$and": [ {"sex": "female"}, {"age": {"$not": {"$eq": "22"}}}]}

# find any document of an employee who is male OR is 25 years old
multiple_param = { "$or": [ {"sex": "male"}, {"age": "25"}]}

# call the find() method to make a query request and sort order
result = col.find( multiple_param ).sort("sex")

# get all of the attributes of the Cursor object returned by API
print ("Cursor attr:", result.__dict__, "\n\n")

# iterate the result Cursor object with documents
for num, doc in enumerate(result):
    print (num, "--", doc, "\n")

"""
DATE RANGE QUERY FROM HTTP REQUEST MESSAGE PARAMETERS
"""
# import the JSON and datetime libraries
import json, datetime

# simulate an HTTP POST request string
http_request_post = '{"user_query": {"year": 2015, "month": 4, "day": 12}}'

# convert the HTTP message into a JSON object
json_date = json.loads(http_request_post)["user_query"]

# parse out the year, month, and day from the JSON object
query_year = json_date["year"]
query_month = json_date["month"]
query_day = json_date["day"]

# create a new datetime() object from the parsed HTTP data
start_date = datetime.datetime(query_year, query_month, query_day)

# you have to explicitly cast the datetime object as a string
# to convert the object to an actual datestring
start_date = str(start_date)

# call the find() method to make a date range request
# "$gt" means "greater than"
result = col.find({"join_date": {"$gt": start_date}}).sort("name")

# iterate over the result Cursor object with enumerate()
for num, doc in enumerate(result):
    print (num, "--", doc, "\n")

"""
ITERATE OVER THE DOCUMENTS RETURNED BY A QUERY
"""
# get a MongoDB database instance
db = mongo_client['some_database']

# get a collection instance from the database
col = db['some_collection']

# declare a new dictionary for the query body
some_query = {"field to search" : "MUST MATCH THIS"}

# use Python's list() function to return the
# Cursor object's list of MongoDB documents
documents = list(col.find( some_query ))

# iterate over the document dictionaries in the list
for doc in documents:
    # access each document's "_id" key
    print ("\ndoc _id:", doc["_id"])

# print the length of the returned list
print ("total documents found:", len(documents))