Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InsertResult / GetObservation reponse time #825

Open
EnocMartinez opened this issue May 25, 2020 · 2 comments
Open

InsertResult / GetObservation reponse time #825

EnocMartinez opened this issue May 25, 2020 · 2 comments

Comments

@EnocMartinez
Copy link

EnocMartinez commented May 25, 2020

Hi,

I'm converting data from a set of CSV files to O&M and Inserting them with InsertResult (using ComplexObservation). I've found that it takes a huge amount of time to insert them to the database. For instance adding 45 minutes of data (about 2700 observations with timestamp, lat, long and 6 other values) takes approximately 2 minutes using a SOS deployed in my desktop computer with an SSD.

GetObservation response time for the same amount of data is also slow, about 12 seconds to retrieve 2700 observations (also using an SSD). If I query directly the database it takes only 8 milliseconds.

If I try to add a day worth of data it will take more than one hour to inject it and about 6.5 minutes to retrieve it. If I need to add / query a month of data it's simply impossible. Is there any way to speed-up injection/retrieval processes? Is there some bug causing slow responses for large time periods?

I've also tried using 6 separated OM_Measurement instead of a ComplexObservation and the results are more or less the same. I'm using version SOS version 4.4.14, postgresql 10.

Attached you can find the files to replicate this issue (InsertSensor, InsertResultTemplate, InsertResult and getObservation) .
transactions.zip

@CarstenHollmann
Copy link

Hi,

in the SOS 4.4.x several extensions resulted in a SQL query being sent to the db for each inserted observation to prevent duplicate data. But this "check" can be disabled via the setting "Should this SOS check for duplicated observations in the REQUEST?" located in the "service" tab. If you can guarantee that no duplicate observation would be insertedm you can disable the flag.
In the new SOS 5.x database model this check is done again by the database constraints.

Querying directly the database is one part of a GetObservation request,therefore this cannot be compared. In the SOS we use the Hibernate framwork to support several database managment systems and the queries (with required joins) are generated under the hood. This leads to other queries that can be a bit more complex as if I query the observation table directly. Furthermore the data from the database must be processed and formatted to the response format (e.g. XML). Depending on the used client/browser to send the GetObservation, the response would be "pretty printed" which also sometimes takes a while.

The injection process can be speed by disabling the flag I mentioned before. For the retrieval of large amounts of data we recommend to use the GetResult operation because this reduces the XML processing of the response and the responses are much smaller.

@EnocMartinez
Copy link
Author

EnocMartinez commented Jun 9, 2020

Hi Carsten,
I've done some more tests with and without the "Should this SOS check for duplicated observations in the REQUEST?" flag. I've tested the versions 4.4.5, 4.4.14 and 5.1.0. All tests are in the same desktop pc with an ssd, transactions are in localhost and each webapp in tomcat has up to 2 GB of RAM (when using a VM it's a lot slower).

SOS Version: 4.4.5
Check Duplicated: ON
InsertResult: 44.32 s
GetObservation: 2.54 s
GetResult: 1.198 s

SOS Version: 4.4.5
Check Duplicated: OFF
InsertResult: 42.87 s
GetObservation: 3.47 s
GetResult: 1.271 s

SOS Version: 4.4.14
Check Duplicated: ON
InsertResult: 1m 6.3 s
GetObservation: 3.47 s
GetResult: 1.2 s

SOS Version: 4.4.14
Check Duplicated: OFF
InsertResult: 41.65 s
GetObservation: 3.35 s
GetResult: 1.208 s

SOS Version: 5.1.0
Check Duplicated: n/a
InsertResult: 9m 0.72 s
GetObservation: 11.16s
GetResult: 6.16 s

For us this is a blocking issue since we are working with datasets that can be several months long (the test datasets is only 25 minutes long). The SOS webapp takes days to inject them to the SOS database and hours to retrieve each dataset, even if we split it into smaller requests.

Is there any other tweak to speed it up? Is it possible to bypass the hibernate framework? Maybe some postgresql settings may be adjusted to squeeze more performance from the database?

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants