The tbl_mongo object is a lazy connection to a MongoDB collection that you can use with dbplyr and dplyr verbs. The query in MongoDB JSON language is only computed when you collect() the results, or when you use collapse() to retrieve the JSON command as a character string.

Usage

tbl_mongo(
  collection = "test",
  db = "test",
  url = "mongodb://localhost",
  mongo = NULL,
  schema = attr(mongo, "schema"),
  max_scan = 100L,
  ...,
  path = getOption("mongotranslate.path")
)

# S3 method for tbl_mongo
print(x, ...)

# S3 method for tbl_mongo
collapse(x, keep.names = FALSE, ...)

# S3 method for tbl_mongo
collect(x, keep.names = FALSE, ...)

# S3 method for mongo_query
print(x, sql = FALSE, ...)

Arguments

collection

The collection to use in the MongoDB database.

db

The database to use from the MongoDB server.

url

The URL to the MongoDB server. This uses mongo() from mongolite internally, see the documentation at https://jeroen.github.io/mongolite/connecting-to-mongodb.html.

mongo

A mongo connection to a MongoDB collection. If provided, it supersedes collection=, db= and url=, which should then be left out (otherwise a warning is issued).

schema

A schema for this collection as calculated by the MongoDB BI app "mongodrdl", in a mongo_schema object from mongo_schema().

max_scan

The maximum number of documents to scan in the collection in order to infer the corresponding schema with mongodrdl (100 by default).

...

More parameters to mongo() to connect to the MongoDB server.

path

The path to the mongotranslate and mongodrdl software. Can be set via options(mongotranslate.path = ...), or left empty if these executables are on the search path.

x

A tbl_mongo or a mongo_query object as obtained with collapse().

keep.names

Logical (FALSE by default). Should the (strange) names constructed by dbplyr be kept in the JSON MongoDB query or not?

sql

Logical (FALSE by default). Should the corresponding SQL statement be printed as well as the JSON query?
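As a minimal sketch of how these arguments fit together, the code below sets the path option once and then lets tbl_mongo() open the connection itself from collection=, db= and url= instead of passing mongo=. The install directory, the local server and the max_scan value are assumptions chosen for illustration, and the sketch assumes path= expects the directory holding both executables.

```r
# Hypothetical install directory for mongotranslate and mongodrdl (an assumption)
options(mongotranslate.path = "/opt/mongodb-bi/bin")

if (FALSE) { # requires a running MongoDB server plus both executables
  library(mongoplyr)
  # No mongo= here: the connection is built from collection=, db= and url=
  tbl <- tbl_mongo(collection = "mtcars", db = "test",
    url = "mongodb://localhost", max_scan = 50L)
  tbl
}
```

Because path= defaults to getOption("mongotranslate.path"), setting the option once per session avoids repeating it in every call.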

Value

A tbl_mongo object that contains the logic to process queries on a MongoDB collection through dplyr verbs. collect() returns a data.frame with the result from querying the MongoDB collection. collapse() returns the MongoDB JSON query corresponding to the process in a mongo_query object.
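The three return values can be told apart with a short sketch like the one below. It assumes a local server whose test.mtcars collection already holds the mtcars data, and that the query object carries the class mongo_query as the print() method above suggests. The name low_mpg is only illustrative.

```r
if (FALSE) { # requires a running MongoDB server, mongotranslate and mongodrdl
  library(mongoplyr)
  library(dplyr)

  # Assumed local server and collection; adapt to your own setup
  mcon <- mongolite::mongo("mtcars", "test", "mongodb://localhost")
  tbl <- tbl_mongo(mongo = mcon)
  low_mpg <- tbl |> filter(mpg < 20) |> select(mpg, cyl)

  # collapse() returns the JSON query instead of the data
  query <- collapse(low_mpg, keep.names = TRUE)
  inherits(query, "mongo_query")
  print(query, sql = TRUE) # also print the corresponding SQL statement

  # collect() runs the query and returns a data.frame
  res <- collect(low_mpg)
  is.data.frame(res)

  mcon$disconnect()
}
```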

Examples

if (FALSE) {
# We use the same little MongoDB server with mtcars set up for {mongolite}
# Note that mongotranslate and mongodrdl must be installed and accessible
# see vignette("mongoplyr").
library(mongoplyr)
library(dplyr)
database <- "test"
collection <- "mtcars"
mongodb_url <- "mongodb+srv://readwrite:test@cluster0-84vdt.mongodb.net"

# Connect and make sure the collection contains the mtcars dataset
mcon <- mongolite::mongo(collection, database, mongodb_url)
mcon$drop()
mcon$insert(mtcars)

# Create a lazy mongo object with this connection
tbl <- tbl_mongo(mongo = mcon)

# Create a mongodb query
tbl2 <- tbl |>
  filter(mpg < 20) |>
  select(mpg, cyl, hp)
tbl2
# Use collect() to get the result
collect(tbl2)
# Use collapse() to get the JSON query
(query <- collapse(tbl2))
# Use this JSON query directly in mongolite
# Note: the connection is also available as attr(tbl2, 'mongo'), but you do
# not need {mongoplyr} any more and can use mongolite::mongo()$aggregate()
mcon$aggregate(query)  # or attr(tbl2, 'mongo')$aggregate(query)

# A more complex example with summarise by group
# Note: currently, names must be fun_var in summarise()
query2 <- tbl |>
  select(mpg, cyl, hp) |>
  group_by(cyl) |>
  summarise(
    mean_mpg = mean(mpg, na.rm = TRUE), sd_mpg = sd(mpg, na.rm = TRUE),
    mean_hp  = mean(hp, na.rm = TRUE),  sd_hp  = sd(hp, na.rm = TRUE)) |>
  collapse()
query2
mcon$aggregate(query2)
mcon$disconnect()
}