Postgres is cursed for only allowing 65535 parameters in a single query?
Someone correct me if I am wrong, but that is a fairly large number (I think Microsoft SQL Server is limited to about 2,100), AND passing that many individual parameters seems like a terrible design pattern.
I learned this one the hard way when trying to query GeoJSON data: getting specific, constrained data about features within an area, excluding features the user doesn’t have access to.
Sometimes this got up to 65k features.
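For concreteness, here’s a minimal sketch of the query shape that hits the cap, with a hypothetical features table standing in for the GeoJSON data. The ceiling is 65,535 because the extended query protocol encodes the bind-parameter count as a 16-bit integer.

    -- Hypothetical stand-in for the GeoJSON feature data.
    CREATE TABLE features (
        id      bigint PRIMARY KEY,
        payload jsonb
    );

    -- The shape that hits the cap: one bind parameter per feature id.
    PREPARE get_features (bigint, bigint, bigint) AS
        SELECT * FROM features WHERE id IN ($1, $2, $3);
    EXECUTE get_features (1, 2, 3);

    -- Scale that list toward ~65k ids through a driver's bind
    -- parameters and Postgres refuses the query outright.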
I definitely could see GeoJSON getting that large.
goes looking for the issue

PostgreSQL has a limit of 65,535 parameters, so bulk inserts can fail with large datasets.
Hmm. I would believe that there are efficiency gains from doing one large insert rather than many small ones (there are probably optimizations one can take advantage of when rebuilding indexes), and it’d be nice for database users to have a way to leverage that.
On the other hand, I can also believe that DBMSes might hold locks while running a query, and permitting queries of unbounded (or very large) size and complexity might create problems for concurrent users, as a lock could be held for a long time.
EDIT: Hmm. Lock granularity probably isn’t the issue:
https://stackoverflow.com/questions/758945/whats-the-fastest-way-to-do-a-bulk-insert-into-postgres

One way to speed things up is to explicitly perform multiple inserts or COPYs within a transaction (say 1000). Postgres’s default behavior is to commit after each statement, so by batching the commits, you can avoid some overhead. As the guide in Daniel’s answer says, you may have to disable autocommit for this to work. Also note the comment at the bottom that suggests increasing the size of the wal_buffers to 16 MB may also help.

It is worth mentioning that the limit for how many inserts/COPYs you can add to the same transaction is likely much higher than anything you’ll attempt. You could add millions and millions of rows within the same transaction and not run into problems.
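A minimal sketch of the batching idea from that answer, reusing the hypothetical features table from above: several multi-row INSERTs inside one explicit transaction, so the commit overhead is paid once rather than per statement.

    BEGIN;
    -- Each statement inserts one batch; nothing commits until the end,
    -- avoiding Postgres's default commit-after-every-statement overhead.
    INSERT INTO features (id, payload) VALUES (1, '{}'), (2, '{}'), (3, '{}');
    INSERT INTO features (id, payload) VALUES (4, '{}'), (5, '{}'), (6, '{}');
    -- ... repeat in batches (the answer suggests around 1000 at a time) ...
    COMMIT;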
Any lock granularity issues would also apply to transactions.
There might be concerns about how the query-processing code scales.
I’d say running up against a 16-bit number for a database import in 2025 is a little cursed. MS is special: Windows still has a 260-character path limit (albeit soft now).
Also, with more phones saving both an image and a video for every shot, two files per snap means 65,535 parameters covers only 32,767 snaps, which is probably a regular headache for initial imports.
I learned that not too long ago, too.
I mean, it surprised me, but there are many ways around it. They may be less efficient, but you can always use string_to_array, or JSON, or COPY, or a CTE, and then work with the inputs as a table.
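Two of those workarounds sketched against the same hypothetical table. Each one ships the entire id list as a single parameter, so the parameter count stays at one no matter how many ids you send.

    -- One text parameter holding a comma-separated id list, split server-side.
    PREPARE by_array (text) AS
        SELECT f.* FROM features f
        WHERE f.id = ANY (string_to_array($1, ',')::bigint[]);
    EXECUTE by_array ('1,2,3');

    -- One jsonb parameter holding an array of ids, unnested and joined.
    PREPARE by_json (jsonb) AS
        SELECT f.* FROM features f
        JOIN jsonb_array_elements_text($1) AS ids(id)
          ON f.id = ids.id::bigint;
    EXECUTE by_json ('[1, 2, 3]');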
Create a user-defined table type and use that as a parameter. I’m not sure what the Postgres name for that is.
And how do you put data into the table?
Based on old memories (I’ve been working in Mongo lately): after making the UDT on the db side, you make a data table that has the same name, namespace (i.e. dbo/public), and schema as the UDT (better if that could be generated) and populate it in code. Then you execute the db query with the UDT-typed table as a parameter.
This is better for a few reasons: besides not building up a SQL string, keeping the query text identical means each query doesn’t need to be re-parsed and can reuse cached execution plans. If the query text isn’t an exact match, it goes through that whole pipeline every time.
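For reference, a minimal T-SQL sketch of what’s being described (all names hypothetical): the client populates a data table matching the type’s schema and passes it as @ids.

    -- SQL Server: a user-defined table type passed as a table-valued parameter.
    CREATE TYPE dbo.IdList AS TABLE (id bigint NOT NULL PRIMARY KEY);
    GO

    -- The statement text never changes, so SQL Server can reuse the cached
    -- execution plan regardless of how many ids the client sends.
    CREATE PROCEDURE dbo.GetFeatures
        @ids dbo.IdList READONLY
    AS
        SELECT f.*
        FROM dbo.features AS f
        JOIN @ids AS i ON i.id = f.id;
    GO

The nearest Postgres equivalents, I believe, are array-typed parameters or an array of a composite type.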