Extending PostgreSQL with index structures, data types, search types, etc. using Java?

I found out that PostgreSQL is written in C. I would like to extend it with:

  • customized index structure
  • customizable nearest neighbor search (with various distance functions)
  • custom data types

Until now, I was afraid to use PostgreSQL because it is written in C. However, I saw on the PostgreSQL page ( http://www.postgresql.org/about/ ) that it supports "library interfaces" for other languages, for example Java. Can I therefore use Java to implement (at least) the nearest neighbor search and the user-defined data types? (I assume the index structure is out of reach, since it is rather low-level.)

1 answer

The answer here is "it's complicated." You can go quite far with a procedural language (including PL/Java), but you will never get the flexibility you get with C. What is fundamentally missing in PL/Java is the ability to properly support indexing, because you cannot create new primitives. For a bit more detail you can see my blog, although most of the examples there are written in PL/pgSQL.

Types

You can go very far with PL/Java (or PL/Perl, or PL/Python, or whatever you like), but some things will not be available. What follows is a very high-level overview of what is and is not possible with a procedural language in the database.

There are two effective ways to work with types in procedural languages. You can work with domains (subtypes of primitives) or with complex types (objects with properties, each of which has some other type: a primitive, a domain, or another complex type). In general, you cannot do much in terms of indexing a complex type as a whole, but you can index its members. The other thing that is not safe to do is custom output formatting, but you can provide other functions to take its place.

For example, suppose we want a type for storing PNG files and examining them for certain properties in the database. We could do it as follows:

CREATE DOMAIN png_image AS bytea CHECK (VALUE LIKE [magic number goes here]);
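As a rough illustration of the kind of test the placeholder above stands for, here is a minimal Java sketch that checks whether a byte array starts with the standard 8-byte PNG signature. The class and method names (PngSignature, isPng) are made up for this example, and wiring the method into the database through PL/Java is assumed rather than shown.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PostgreSQL or PL/Java.
// Declared package-private so the sketch compiles in any file;
// a deployed class would normally be public.
final class PngSignature {
    // The fixed 8-byte signature that starts every PNG file.
    private static final byte[] SIGNATURE = {
        (byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
    };

    // Returns true only if the data begins with the PNG signature.
    public static boolean isPng(byte[] data) {
        if (data == null || data.length < SIGNATURE.length) {
            return false;
        }
        byte[] head = Arrays.copyOf(data, SIGNATURE.length);
        return Arrays.equals(head, SIGNATURE);
    }
}
```

This only inspects the header; it does not verify that the rest of the file is a valid image.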

Then we could create a bunch of stored procedures for handling PNGs in various ways. For example, an is_sunset function could look for orange near the top of the image. We could then do something like:

 SELECT l.name FROM landmark l JOIN landmark s ON (s.name = 'San Diego City hall' and ST_DISTANCE(l.coords, s.coords) < '20') WHERE is_sunset(l.photo) ORDER BY l.name; 

There is no reason why is_sunset cannot be written in Java, Perl, or any other language. Since is_sunset returns a bool, we could even do:

 CREATE INDEX l_name_sunset_idx ON landmark (name) where is_sunset(photo); 

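This would speed up the query by effectively caching the names of the landmarks with sunset photos in a partial index. Note that PostgreSQL only accepts functions marked IMMUTABLE in an index predicate, so is_sunset has to be declared that way and must depend on nothing but its input.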
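To make the idea concrete, here is a minimal Java sketch of what the body of an is_sunset function might look like. Everything in it is an assumption made for illustration: the class name SunsetCheck, the "top third of the image" rule, the RGB range treated as orange, and the 30% threshold. Registering it with the database through PL/Java is not shown.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical heuristic, only meant to show the shape of such a function.
final class SunsetCheck {
    // Assumed threshold: at least 30% of the top third must be orange.
    static final double MIN_ORANGE_FRACTION = 0.30;

    // Entry point taking the raw image bytes (what a bytea column would hold).
    public static boolean isSunset(byte[] photo) {
        if (photo == null) {
            return false;
        }
        try {
            BufferedImage img = ImageIO.read(new ByteArrayInputStream(photo));
            // ImageIO.read returns null when no decoder recognizes the data.
            return img != null && topIsOrange(img);
        } catch (IOException e) {
            return false; // treat undecodable input as "not a sunset"
        }
    }

    // Counts orange pixels in the top third of the decoded image.
    static boolean topIsOrange(BufferedImage img) {
        int rows = Math.max(1, img.getHeight() / 3);
        long orange = 0;
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                if (isOrange(img.getRGB(x, y))) {
                    orange++;
                }
            }
        }
        return orange >= MIN_ORANGE_FRACTION * rows * (long) img.getWidth();
    }

    // Assumed definition of "orange": strong red, medium green, little blue.
    static boolean isOrange(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return r >= 200 && g >= 80 && g <= 180 && b <= 100;
    }
}
```

The result depends only on the input bytes, which is the property a function needs if it is to be used in an index predicate.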

What you cannot do in Java is create new primitive types. Keep in mind that things like index support live at the primitive level, so you cannot, for example, create a new IP address type that supports GiST indexing (not that you would need to, since ip4r is available).

Thus, to the extent that you can reuse and work with primitives that already exist, you can do your development in Java or whatever you like. You are really limited only by which primitives are available, and enough people have written new ones in C that you may not need to touch C at all.

Indices

Index code is pretty much all C, like the primitives. You cannot customize index behavior in a procedural language. What you can do is work with primitives written by other developers, and so on. This is the area where you are most likely to have to drop down to C.

(Update: now that I think about it, it is possible to hook into existing index types and add support for different kinds of indexing based on other PL functions, using the CREATE OPERATOR CLASS and CREATE OPERATOR commands. I have no experience with this, though.)
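For orientation only: a B-tree operator class is built from comparison operators plus a support function that returns a negative, zero, or positive integer. A minimal Java sketch of functions with that shape is given below, using a case-insensitive ordering of strings as the example. The class and method names are hypothetical, and whether functions like these behave well as operator class members when served through PL/Java is an untested assumption.

```java
// Hypothetical comparison functions with the shape a B-tree operator
// class expects: one three-way comparison plus boolean operators.
final class CaseInsensitiveOrder {
    // Three-way comparison: negative, zero, or positive.
    public static int compare(String a, String b) {
        return Integer.signum(a.compareToIgnoreCase(b));
    }

    // Boolean functions that could back the <, =, and > operators.
    public static boolean lessThan(String a, String b) {
        return compare(a, b) < 0;
    }

    public static boolean equalTo(String a, String b) {
        return compare(a, b) == 0;
    }

    public static boolean greaterThan(String a, String b) {
        return compare(a, b) > 0;
    }
}
```

All operators are derived from the single compare method so that they cannot disagree with each other, which matters because an index relies on a consistent ordering.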

Performance

Remember that PL/Java means running a JVM in every backend process. In many cases, if you can do what you want in PL/pgSQL, you will get better performance. The same goes for other languages, because each one needs an interpreter or other runtime in the backend process.
