I need to delete selected values from a Graphite whisper data set. It is possible to overwrite a single value just by sending a new value, or to delete the whole set by deleting the .wsp file, but what I need is to delete just one (or several) selected values, i.e. reset them to the same state as if they had never been written (undefined, so Graphite returns nulls). Overwriting doesn't do that.
How can I do this? (A programmatic solution is fine.)
See also:
How to cleanup the graphite whisper's data?
Removing spikes from Graphite due to erroneous data
Graphite (whisper) usually ships with the whisper-update utility.
You can use it to modify the contents of a .wsp file:
whisper-update.py [options] path timestamp:value [timestamp:value]*
If the timestamp you want to modify is recent (as defined by carbon), you may want to wait or shut down your carbon-cache daemons first.
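If you prefer to script it, here is a minimal sketch using the whisper Python library (the same package that provides whisper-update.py); the path, value, and timestamp below are placeholders for your own:

import whisper

# placeholder path and timestamp -- point these at your own metric
path = "/opt/graphite/storage/whisper/servers/web01/load.wsp"
timestamp = 1425744000

# overwrite the datapoint stored at that timestamp
whisper.update(path, 0.0, timestamp)

# read it back to confirm the change
(start, end, step), values = whisper.fetch(path, timestamp - 60, timestamp + 60)
print(start, end, step, values)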
OpenTSDB by default supports up to 8 tags per data point (see its documentation), and you can modify this in the configuration. Since CnosDB adopts a similar tag-set model, is there any limit on the number of tags we can create in a single database?
I assume the limitation (if it exists) is there to avoid high series cardinality, which could lead to OOM, right?
By default, there are some settings related to tags and series in the config file.
max-series-per-database = 1000000
The maximum number of series allowed per database before writes are dropped. The default setting is 1000000. Change the setting to 0 to allow an unlimited number of series per database.
If a point causes the number of series in a database to exceed max-series-per-database, CnosDB will not write the point, and it returns an error.
max-values-per-tag = 100000
The maximum number of tag values allowed per tag key. The default value is 100000. Change the setting to 0 to allow an unlimited number of tag values per tag key. If a tag value causes the number of tag values of a tag key to exceed max-values-per-tag, then CnosDB will not write the point, and it returns a partial write error.
Any existing tag keys with tag values that exceed max-values-per-tag will continue to accept writes, but writes that create a new tag value will fail.
The more tags you have, the more series you have. By default, you can't create an unlimited number of series in a CnosDB instance. If you want to write more series and tags to a CnosDB instance, you can edit the config file and load it with the command cnosdb --config /config/file/directory, then restart the CnosDB instance so the new config file takes effect.
See the CnosDB documentation for more information about its config file.
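As a rough illustration of why more tags mean more series in a tag-set model (the tag keys and cardinalities below are made up, not CnosDB defaults):

# One measurement can produce up to the product of the distinct values of each
# tag key -- this is the cardinality that max-series-per-database guards against.
tag_cardinalities = {"host": 500, "region": 10, "service": 40}  # hypothetical numbers

max_series = 1
for tag_key, distinct_values in tag_cardinalities.items():
    max_series *= distinct_values

print(max_series)  # 200000 potential series from just three tag keys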
Is there a way to download more than 100MB of data from Snowflake into Excel or CSV?
I'm able to download up to 100MB through the UI by clicking the 'download or view results' button.
You'll need to consider using what we call "unload", a.k.a. COPY INTO LOCATION
which is documented here:
https://docs.snowflake.net/manuals/sql-reference/sql/copy-into-location.html
Other options might be to use a different type of client (python script or similar).
I hope this helps...Rich
.....EDITS AS FOLLOWS....
Using the unload (COPY INTO LOCATION) isn't quite as overwhelming as it may appear to be. If you can use the SnowSQL client (instead of the web UI), you can "grab" the files from what we call an "internal stage" fairly easily; an example follows.
CREATE TEMPORARY STAGE my_temp_stage;
COPY INTO @my_temp_stage/output_filex
FROM (select * FROM databaseNameHere.SchemaNameHere.tableNameHere)
FILE_FORMAT = (
TYPE='CSV'
COMPRESSION=GZIP
FIELD_DELIMITER=','
ESCAPE=NONE
ESCAPE_UNENCLOSED_FIELD=NONE
date_format='AUTO'
time_format='AUTO'
timestamp_format='AUTO'
binary_format='UTF-8'
field_optionally_enclosed_by='"'
null_if=''
EMPTY_FIELD_AS_NULL = FALSE
)
overwrite=TRUE
single=FALSE
max_file_size=5368709120
header=TRUE;
ls @my_temp_stage;
GET @my_temp_stage file:///tmp/;
This example:
Creates a temporary stage object in Snowflake, which will be discarded when you close your session.
Takes the results of your query and loads them into one (or more) CSV files in that internal temporary stage, depending on the size of your output. Notice how I didn't create another database object called a "FILE FORMAT"; it's considered a best practice to do so, but you can do these one-off extracts without creating that separate object if you don't mind the command being this long.
Lists the files in the stage, so you can see what was created.
Pulls the files down using GET. In this case it was run on my Mac and the file(s) were placed in /tmp; if you are using Windows you will need to modify the target path a little.
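If you go the "different type of client" route mentioned above, a minimal Python sketch using the snowflake-connector-python package might look like this (account, credentials, and output path are placeholders):

import csv
import snowflake.connector

# placeholder connection details -- fill in your own account and credentials
conn = snowflake.connector.connect(
    account="your_account",
    user="your_user",
    password="your_password",
    database="databaseNameHere",
    schema="SchemaNameHere",
)

try:
    cur = conn.cursor()
    cur.execute("SELECT * FROM tableNameHere")

    # stream the result set straight into a CSV file, so the 100MB UI limit never applies
    with open("/tmp/output.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])  # header row
        for row in cur:
            writer.writerow(row)
finally:
    conn.close()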
I've looked through the docs for a way to use Camel for ETL just as in the site's examples, except with these additional conditions based on an MD5 match.
Like the Camel example, myetl/myinputdir would be monitored for any new file, and if one is found, the file ${filename} would be processed.
Except it would first wait for ${filename}.md5 to show up, which would contain the correct MD5 hash. If ${filename}.md5 never showed up, it would simply ignore the file until it did.
And if ${filename}.md5 did show up but the MD5 didn't match, the file would be processed, but with an error condition.
I found suggestions to use crypto for matching, but I have not figured out how to ignore the file until the matching .md5 file shows up. Really, these two files need to be processed as a matched pair for everything to work properly, and they may not arrive in the input directory at the exact same millisecond. Alternatively, the md5 file might show up a few milliseconds before the data file.
You could use an Aggregator to combine the two files based on their file name. If your files are suitably named, you can use the file name (without extension) as the correlation ID and continue the route once completionSize equals 2. If you set groupExchanges to true, then in your next route step you have access to both the file to compute the hash value for and the contents of the md5 file to compare that hash against. And if the md5 or content file never arrives within completionTimeout, you can trigger whatever action is appropriate for your scenario.
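The Camel wiring depends on your setup, but the pairing and verification logic the route needs to express boils down to something like this (a plain Python sketch rather than Camel DSL; the directory and naming convention are hypothetical):

import hashlib
from pathlib import Path

inbox = Path("myetl/myinputdir")  # hypothetical input directory

for md5_file in inbox.glob("*.md5"):
    data_file = md5_file.with_suffix("")            # "x.csv.md5" -> "x.csv"
    if not data_file.exists():
        continue                                     # md5 arrived first; keep waiting for the data file

    expected = md5_file.read_text().split()[0].strip().lower()
    actual = hashlib.md5(data_file.read_bytes()).hexdigest()

    if actual == expected:
        print(f"{data_file.name}: checksum OK, process normally")
    else:
        print(f"{data_file.name}: checksum mismatch, process with an error condition")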
I currently have an issue with my data import handler where ${dataimporter.last_index_time} is not granular enough to capture two events that happen within a second of each other, leading to cases where a record in my database is skipped.
I am thinking of replacing last_index_time with a simple atomically incrementing value instead of a datetime, but in order to do that I need to be able to set and read custom variables through Solr that can be referenced in my data-config.xml file.
Alternatively, if I could find some way to set dataimporter.last_index_time, that would work just as well as I could ensure that the last_index_time is less than the newly-committed rows (and more importantly, that it is set from the same clock).
Does Solr support this?
Short answer: yes, it does.
Long answer:
At work I'm passing parameters in the request (see "DataImportHandler: Accessing request parameters") with default values set in the handler definition (solrconfig.xml).
To sum up:
You can use something like this in data-config.xml:
${dataimporter.request.your_variable}
With a request like:
/dataimport?command=delta-import&clean=false&commit=true&your_variable=123
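If you drive the import from a script, passing that variable is straightforward; here is a minimal sketch using the requests library (the Solr host, core name, and counter value are assumptions for illustration):

import requests

# hypothetical Solr core URL -- adjust host, port, and core name to your install
DATAIMPORT_URL = "http://localhost:8983/solr/mycore/dataimport"

params = {
    "command": "delta-import",
    "clean": "false",
    "commit": "true",
    "your_variable": 123,  # e.g. the last sequence value you indexed, read from your own store
}

resp = requests.get(DATAIMPORT_URL, params=params)
print(resp.status_code, resp.text[:200])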
I want to read a database result into variables so I can use them in later requests.
How can I do it?
What if I want to return multiple columns, or even rows, from the database? Can I loop over the returned table the same way I can with "CSV Data Set Config"?
--edit--
OK, I found a solution that uses a regular expression to parse the response, but that solution and others like it don't work for me, because they require me to change my SQL queries so JMeter can parse them more "easily". I'm using JMeter to do load testing, and the last thing I want is to maintain two different versions of the code, one for "testing" and another for "runtime".
Is there a "specific" JDBC Request solution that enables me to read the result into variables using the concept of result sets and columns?
Using the Regular Expression Extractor shouldn't affect what your SQL statement looks like. If you need to modify which part of the response you store in a variable, use a Beanshell sampler with Java code to parse the response and store it into a variable.
You can loop through the returned table by using a ForEach Controller that references the variable name from the regex extractor. Make sure that in your regex extractor you set the match number to -1 to capture every possible match; JMeter then creates variables like yourVar_1, yourVar_2, ... plus yourVar_matchNr, which the ForEach Controller can iterate over.