Estimation of the Data logging size [closed]

I have a device generating some number of values, say N, each value being 32 bits.
I am logging these values every 10 seconds by writing a new row to an Excel file, and I will be creating a new file every day.
I have to estimate the hard disk capacity necessary to store these log files for a period of 10 years.
Can someone give any hints on calculating the size of the log file generated per day?

Assuming worst-case two's-complement 32-bit values stored as ASCII...
-2147483648 is 11 characters; with a delimiter, call it 13 characters per value
1 value / 10 seconds
3,600 seconds / hour
24 hours / day
that's 8,640 values, or 112,320 bytes, per day, per value N
"round" that up to 112,640 bytes (divisible by 1024) per day
365.25 days per year
10 years
that's N * 411,417,600 bytes, or slightly more than N * 400 MB.
So if N were 10, that would be slightly more than 4 GB.
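To sanity-check that arithmetic, here is a minimal sketch in C. The 13-bytes-per-value figure is the assumption from above (11 characters for -2147483648 plus a delimiter, rounded up):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Assumption from the answer above: 13 ASCII bytes per logged value. */
    const uint64_t bytes_per_value = 13;
    const uint64_t values_per_day  = 24 * 3600 / 10;  /* one sample per 10 s = 8640 */

    uint64_t bytes_per_day = bytes_per_value * values_per_day;  /* 112,320 */
    double   ten_years     = bytes_per_day * 365.25 * 10.0;     /* ~410 MB per value N */

    printf("per day, per value N:       %llu bytes\n",
           (unsigned long long)bytes_per_day);
    printf("over 10 years, per value N: %.0f bytes (~%.0f MB)\n",
           ten_years, ten_years / 1e6);
    return 0;
}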

Create a sample spreadsheet, add 1,000 rows, and save it under a different name.
That will give an estimate of the per-row cost.
Incremental writing is not a good fit for a complex format such as a spreadsheet; a spreadsheet tends to rewrite the whole file on each flush, whereas a text log file can simply be appended to. See the sketch below.
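For contrast, appending a row to a plain-text CSV log writes only the new bytes. A minimal sketch (the log_row name and CSV layout are mine, not from the question):

#include <stdio.h>
#include <stdint.h>

/* Append one row of n comma-separated values to a text log file.
   Unlike a spreadsheet save, this touches only the new bytes. */
int log_row(const char *path, const int32_t *values, size_t n)
{
    FILE *f = fopen(path, "a");
    if (f == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        fprintf(f, "%ld%c", (long)values[i], (i + 1 < n) ? ',' : '\n');
    return fclose(f);
}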

Related

How to stuff any number of 16-bit values into 8-10 bytes of data? [closed]

I am working on an algorithm where I can have any number of 16-bit values (for instance, 1,000 16-bit values, all sensor data, so no particular series or repetition). I want to stuff all of this data into an 8- or 10-byte array (each and every one of the 1,000 16-bit values should be inside the 10-byte array). The encoding should be such that I can also easily decode and read back each and every one of the 1,000 values.
I have thought of using a sin function, dividing the values by 100 so every data point would always fit in 8 bits (the 0-1 sine value range), but that only covers a small range of data, not a huge number of values.
Pardon me if I am asking for too much. I am just curious whether it's possible or not.
The answer to this question is rather obvious with a little knowledge of information theory: it is not possible to store that much information in so little memory, and the data you are talking about simply contains too much information.
Some data, like repetitive data or data that follows some structure (such as constantly rising values), contains very little information. The task of a compression algorithm is to figure out that structure or repetition and, instead of storing the raw data, store the structure or the rule for reproducing it.
In your case, the data comes from sensors, and unless you are willing to lose a massive amount of information, you will not be able to compress it by a factor of the magnitude you are talking about (1,000 × 2 bytes into 10 bytes). If your sensors more or less produce the same values all the time with just a little jitter, good compression can be achieved (though for that your question is way too broad to be answered here), but it will probably never be in the range of reducing your 1,000 values to 10 bytes.
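To make the counting argument concrete: 1,000 values of 16 bits each carry 1,000 × 16 = 16,000 bits, so there are 2^16000 possible inputs, while a 10-byte (80-bit) array can take on only 2^80 distinct values. By the pigeonhole principle, a vast number of different inputs would have to share the same 10-byte encoding, so decoding every original value back is impossible without loss.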

Time/date implementation on an MCU [closed]

I am working on an MCU and my aim is to implement time/date on it.
I use a timer that ticks once per second and store the count in a uint32_t, which is large enough to hold about 136 years of seconds. I want to use 2000 as the reference year, with 2099 as the maximum.
Here is my data struct:
typedef struct
{
    uint8_t sec;   // Seconds. [0-60] (1 leap second)
    uint8_t min;   // Minutes. [0-59]
    uint8_t hour;  // Hours. [0-23]
    uint8_t day;   // Day. [1-31]
    uint8_t month; // Month. [0-11]
    uint8_t year;  // Year - from 2000. [00-99]
} osal_time_t;
What is the best way to convert the seconds (the uint32_t count) to min/hr/day/month/year correctly while using the fewest resources?
Minutes, hours, and years seem simple, but days get tricky with 28/29/30/31-day months, and February has 29 days every 4 years.
I have seen the Linux source code implementation, but I think it is designed for an OS, not for a humble MCU.
Can anyone hint at what kind of algorithm I should use on an MCU so that it requires minimal resources?
As an example, what algorithm is used to calculate this: http://www.mathcats.com/explore/elapsedtime.html
If you have any code snippet, I would appreciate it if you could share it.
You just have to do the math; there is no way around it. You are converting from base 2 to base 10 (well, base 60 represented in base 10).
Likewise for the month/day logic: you have to grind through that as well, with a table of some sort for days per month, and deal with leap years.
The alternative to doing the math is changing how you count, using more memory but less calculation: basically a BCD approach. When the ones digit of the seconds rolls over from 9, increment the tens digit and reset the ones digit to 0; repeat all the way up through the date. Or meet halfway: let seconds roll over to zero at 60 and increment minutes, then do the base-10 work only to separate the tens from the ones of seconds, minutes, and hours. You could use a table for that if you don't have a divide instruction.
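A minimal sketch of the "do the math" approach in C, assuming the count is seconds since 2000-01-01 00:00:00 and the stated 2000-2099 range (so a simple year % 4 leap rule is safe; 2100 is outside the range). The function and table names are mine:

#include <stdint.h>

static const uint8_t days_in_month[12] =
    { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

void seconds_to_time(uint32_t count, osal_time_t *t)
{
    uint32_t days = count / 86400u;   /* whole days elapsed since 2000-01-01 */
    uint32_t rem  = count % 86400u;   /* seconds into the current day */

    t->hour = (uint8_t)(rem / 3600u);
    rem %= 3600u;
    t->min = (uint8_t)(rem / 60u);
    t->sec = (uint8_t)(rem % 60u);

    /* Peel off whole years; 2000 is a leap year, so within 2000-2099
       every year divisible by 4 has 366 days. */
    uint8_t year = 0;
    for (;;) {
        uint32_t ydays = ((year & 3u) == 0u) ? 366u : 365u;
        if (days < ydays)
            break;
        days -= ydays;
        year++;
    }
    t->year = year;

    /* Peel off whole months, adjusting February in leap years. */
    uint8_t month = 0;
    for (;;) {
        uint32_t mdays = days_in_month[month];
        if (month == 1 && (year & 3u) == 0u)
            mdays = 29;
        if (days < mdays)
            break;
        days -= mdays;
        month++;
    }
    t->month = month;               /* [0-11], matching the struct comment */
    t->day   = (uint8_t)(days + 1); /* [1-31] */
}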
This is not a programming problem, because you can't do this reliably with just a microcontroller. An internal RC oscillator will be way too inaccurate, and even if you use a high-accuracy external crystal oscillator, it will drift over time and may vary with temperature.
The only correct solution is to add a real-time clock circuit to the hardware, preferably together with a backup battery. How to communicate with the real-time clock circuit is hardware-specific.
Questions like this, on the borderline with hardware, are better asked on https://electronics.stackexchange.com/.

Binary classification of sensor data using minimal code space [closed]

I am trying to classify the events above as 1 or 0: 1 would be the lower values and 0 would be the higher values. Usually the data does not look as clean as this. Currently the approach I am taking is to use two different thresholds, so that in order to go from 0 to 1 the signal has to go past the 1-to-0 threshold and stay above it for 20 sensor values. This threshold is set to the highest value I receive minus ten percent of that value. I don't think a machine learning approach will work, because I have too few features to work with, and the implementation has to take up minimal code space. I am hoping someone may be able to point me in the direction of a known algorithm that would apply well to this sort of problem; googling it and checking my other sources isn't producing great results. The current implementation is very effective, and the hardware isn't going to change.
"Currently the approach I am taking is to use two different thresholds, so that in order to go from 0 to 1 the signal has to go past the 1-to-0 threshold and stay above it for 20 sensor values."
Calculate the area on your graph under those 20 sensor values. If the area is greater than a threshold (perhaps half the peak value), assign it a 1; otherwise assign it a 0.
Since your measurements are one unit wide (pixels, or sensor readings), the area ends up being just the sum of the 20 sensor values.
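A minimal sketch of that idea in C (the function name, sample type, and peak handling are my assumptions, not the asker's code):

#include <stdint.h>

#define WINDOW 20  /* number of sensor values per decision, per the question */

/* Sum the window (the "area", since each sample is one unit wide) and
   compare against half the peak area. Returns 1 or 0 as in the answer. */
int classify_window(const uint16_t samples[WINDOW], uint16_t peak)
{
    uint32_t area = 0;
    for (int i = 0; i < WINDOW; i++)
        area += samples[i];

    uint32_t threshold = ((uint32_t)peak * WINDOW) / 2u;
    return (area > threshold) ? 1 : 0;
}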

How does Google Channel API pricing work? [closed]

I read in the official docs that opening a new channel costs $0.01 and that a channel lasts 2 hours.
So if I have 1,000 concurrent users who use my site daily for 2 hours, the total cost will be 1,000 × $0.01 = $10 daily, plus bandwidth cost and CPU cost. Right?
Do they charge hourly too? I.e., if the concurrent users use the site daily for 4 hours, will the resultant total cost be 1,000 × $0.01 × 2 = $20?
It's only $0.01 per 100 channels, which equates to $0.0001 per channel. You can also change the lifetime of the channel token from the default 2 hours (you can make it greater or smaller), so you can effectively reuse channel tokens, depending on how your application uses them.
So, if you leave the channel token lifetime at 2 hours, it would be 1,000 × $0.0001 × 2 = $0.20 for the cost of channel token creation alone.
The rest of the cost, as you've indicated here, will depend on your bandwidth, CPU, and other server-side usage costs.
It seems the calculation shown at https://developers.google.com/appengine/docs/billing is also wrong.

Estimating database size [closed]

I was wondering what you do, when developing a new application, to estimate database size.
E.g. I am planning to launch a website, and I am having a hard time estimating what size my database could grow to. I don't expect you to tell me what size my database will be, but I'd like to know whether there are general principles for estimating this.
E.g. when Jeff developed StackOverflow, he (presumably) guesstimated his database size and growth.
My dilemma is that I am going for a hosted solution for my web application (it's about cost at this stage), and I preferably don't want to shoot myself in the foot by not purchasing enough SQL Server space (they charge a premium for it).
If you have a database schema, sizing is pretty straightforward: it's just estimated rows × average row size for each table, times some factor for indexes, times some other factor for overhead. Given the ridiculously low price of storage nowadays, sizing often isn't a problem unless you intend to have a very high-traffic site (or are building an app for a large enterprise).
For my own sizing exercises, I've always created an Excel spreadsheet listing:
col 1: each table that will grow
col 2: estimated row size in bytes
col 3: estimated # of rows (per year or max, depending on the application)
col 4: index factor (I always set this to 2)
col 5: overhead factor (I always set this to 1.2)
col 6: total column (col 2 × 3 × 4 × 5)
The sum of col 6 (the total column), plus the initial size of your database without the growth tables, is your size estimate. You can get much more scientific, but this is my quick and dirty way.
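The same quick-and-dirty calculation as a small C program, with made-up table names and figures purely for illustration:

#include <stdio.h>

/* One entry per growing table, mirroring the spreadsheet columns above. */
struct table_estimate {
    const char *name;      /* col 1 */
    double row_bytes;      /* col 2: estimated row size in bytes */
    double rows;           /* col 3: estimated # of rows */
    double index_factor;   /* col 4 */
    double overhead;       /* col 5 */
};

int main(void)
{
    /* Hypothetical figures for illustration only. */
    const struct table_estimate t[] = {
        { "users",    200.0,  50000.0, 2.0, 1.2 },
        { "posts",   1000.0, 250000.0, 2.0, 1.2 },
        { "comments", 300.0, 800000.0, 2.0, 1.2 },
    };
    double total = 0.0;
    for (size_t i = 0; i < sizeof t / sizeof t[0]; i++) {
        double bytes = t[i].row_bytes * t[i].rows
                     * t[i].index_factor * t[i].overhead;  /* col 6 */
        printf("%-9s %14.0f bytes\n", t[i].name, bytes);
        total += bytes;
    }
    printf("%-9s %14.0f bytes (%.1f MB)\n", "total", total, total / 1e6);
    return 0;
}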
Determine:
how many visitors per day, V
how many records of each type will be created per visit, N1, N2, N3...
the size of each record type, S1, S2, S3...
EDIT: I forgot the index factor; a good rule of thumb is a factor of 2.
Total growth per day = 2 × V × (N1×S1 + N2×S2 + N3×S3 + ...)
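For illustration only (made-up figures): with V = 1,000 visitors per day, one 500-byte record and two 200-byte records created per visit, growth is 2 × 1,000 × (1×500 + 2×200) = 1,800,000 bytes, about 1.8 MB per day.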
My rules-of-thumb to follow are
how many users do I expect?
what content can they post?
how big is a user record?
how big is each content item a user can add?
how much will I be adding?
how long will those content items live? forever? just a couple weeks?
Multiply the user record size by the number of users; add the number of users times the content item size; multiply by two (for a convenient fudge factor).
The cost of estimating is likely to be larger than the cost of the storage.
Most hosting providers sell capacity by the amount used at the end of each month, so just let it run.
