MongoDB - 100% CPU, 95%+ lock, low disk activity - arrays

I have a MongoDB setup against which I am running a lot of findAndModify queries. At first it performed fine, doing ~400 queries and ~1000 updates per second (according to mongostat). This caused an 80-90% lock percentage, but that seemed reasonable given the amount of data throughput.
After a while it has slowed to a crawl and is now doing a meager ~20 queries / ~50 updates per second.
All of the queries are on one collection. The majority of the documents have a set of basic data (just key: value entries, no arrays or similar) that is untouched, and then an array of downloads with the format plus the number of bytes downloaded. Example:
downloads: [
    {
        'bytes': 123131,
        'format': 'extra'
    },
    {
        'bytes': 123131,
        'format': 'extra_hd'
    },
    ...
]
A bit of searching tells me that big arrays are not good, but if the majority of documents only have 10-15 entries in this array (with a few outliers that have 1000+), should it still affect my instance this badly?
CPU load is constantly near 100%, and lock % is constantly near 100%. The queries I use are indexed (I confirmed via explain()), so that should not be the issue.
Running iostat 1 gives me the following:
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
56.86 122 6.80 14 5 81 2.92 2.94 2.48
24.00 9 0.21 15 1 84 2.92 2.94 2.48
21.33 3 0.06 14 2 84 2.92 2.94 2.48
24.00 3 0.07 15 1 84 2.92 2.94 2.48
33.14 7 0.23 14 1 85 2.92 2.94 2.48
13.68 101 1.35 15 2 84 2.92 2.94 2.49
30.00 4 0.12 14 1 84 2.92 2.94 2.49
16.00 4 0.06 14 1 85 2.92 2.94 2.49
28.00 4 0.11 14 2 84 2.92 2.94 2.49
33.60 5 0.16 14 1 85 2.92 2.94 2.49
I am using MongoDB 2.4.8, and while upgrading is an option, I would prefer to avoid it. It is running on my local SSD on OS X. It will be moved to a server eventually, but I would like to fix, or at least understand, the performance issue before I move it.

Related

Create a running total in SQL with hours but only using work hours

This might be a strange question, but I will try to explain it as best I can.
BTW: there is no chance of implementing this through stored procedures; it has to be done in a SQL query only. But if the only option is an SP, then I will have to adapt to that.
I have a table with the following columns:
RUN    WORKORDER  LOCATION    TRAVELTIME  NUMEQUIP  TOT_TIME
NO99   1          Start
NO99   2          Customer 1  112         1         8
NO99   3          Customer 2  18          11        88
NO99   4          Customer 3  22          93        744
NO99   5          Customer 4  34          3         24
I need to add a running DATE and TIME by calculating the amount of time it takes to go from one line to the next BUT, and this is important, taking working hours into consideration: from 9:00 to 13:00 and from 14:00 to 18:00 (in US format: 9am to 1pm and 2pm to 6pm). As an example, considering that my start date and time would be 10/May/2022 9:00:
RUN    WORKORDER  LOCATION    TRAVELTIME  NUMEQUIP  TOT_TIME  DATE      TIME
NO99   1          Start                                       10/05/22  9:00
NO99   2          Customer 1  112         1         8         10/05/22  10:52
NO99   3          Customer 2  18          11        88        10/05/22  11:18
NO99   4          Customer 3  22          93        744       10/05/22  14:08
NO99   5          Customer 4  34          3         24        12/05/22  10:06
This result is achieved by calculating the estimated travel time between customers (TRAVELTIME) and, after arriving, adding the time spent on maintenance (TOT_TIME, which is the number of equipments (NUMEQUIP) times 8 minutes per equipment). Since customer 3 will have 744 minutes (12 hours and 24 minutes) of maintenance, and those minutes will span three days, the result should be as shown.
With the following query I can get almost the desired effect, but it cannot restrict the calculation to work hours only; all time is continuous:
Select
    RUN, WORKORDER, LOCATION, TRAVELTIME,
    DateAdd(mi, temprunningtime - TOT_TIME, '9:00') As TIME,
    NUMEQUIP, NUMEQUIP * 8 AS TOT_TIME,
    sum(MYTABLE.TRAVELTIME + MYTABLE.TOT_TIME)
        OVER (ORDER BY MYTABLE.WORKORDER) AS temprunningtime
FROM MYTABLE
With this query (slightly altered) I get a running TIME, but it does not take into account the 13:00-14:00 break or the 18:00-9:00 stop.
It might be a bit confusing, but any ideas on this would be very much appreciated, and I will try to explain any way I can.
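As a sketch only (this is not an answer from the thread): the arithmetic described above can be reproduced by accumulating working minutes exactly as the query does (a running sum of TRAVELTIME + TOT_TIME minus the current stop's TOT_TIME) and then mapping that total onto 480-minute working days. The sketch assumes SQL Server 2012 or later (the windowed SUM already used above requires it), hard-codes the start date 2022-05-10, ignores weekends and holidays, and uses the column names from the sample table.

WITH t AS (
    SELECT RUN, WORKORDER, LOCATION, TRAVELTIME, NUMEQUIP,
           NUMEQUIP * 8 AS TOT_TIME,
           -- working minutes elapsed at arrival (everything before this stop's maintenance)
           SUM(COALESCE(TRAVELTIME, 0) + COALESCE(NUMEQUIP, 0) * 8)
               OVER (ORDER BY WORKORDER)
             - COALESCE(NUMEQUIP, 0) * 8 AS work_min
    FROM MYTABLE
)
SELECT RUN, WORKORDER, LOCATION, TRAVELTIME, NUMEQUIP, TOT_TIME,
       CONVERT(date, DATEADD(day, work_min / 480, '2022-05-10')) AS [DATE],
       CONVERT(time(0), DATEADD(minute,
               CASE WHEN work_min % 480 < 240
                    THEN  9 * 60 + work_min % 480         -- falls in the 9:00-13:00 block
                    ELSE 14 * 60 + work_min % 480 - 240   -- falls in the 14:00-18:00 block
               END, 0)) AS [TIME]
FROM t
ORDER BY WORKORDER;

Against the sample rows above this produces 10:52, 11:18, 14:08 and then 12/05/22 10:06 for Customer 4, matching the expected output; skipping weekends as well would need a calendar table or extra date math.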

How to sum two columns of different tables in SQL

I created two tables, book and cd:
Books     Book_id  author     publisher  rate
angular2  132      venkat     ts         1900
angular   160      venkat     ts         1500
html 5    165      henry      vk         1500
html      231      henry      vk         2500
css       256      mark       adobe      1600
java      352      john       gulberg    4500
c#        450      henry      adobe      1600
jsp       451      henry      vk         2500
ext js    555      kv venkat  w3         5102
html      560      kv venkat  gulberg    5000
java2     561      john       gulberg    9500
java8     651      henry      vk         1650
js        654      henry      ts         2500
java      777      babbage    adobe      5200
phython   842      john       ts         1500
spring    852      henry      w3         6230
spring    895      mark       tut        4250
ext js    965      henry      gulberg    4500
book_id  Cd_name            Cd_price
132      angular2           500
132      angular1           600
132      angular basics     600
132      angular expert     900
160      begineer_course    1200
160      angular_templates  500
165      html_tutorials     900
165      bootstrap          1000
256      css styles         650
256      expert css         900
555      extjs              1200
555      exjs_applications  500
777      core java          2500
777      java swing         4500
777      java tutorials     1500
842      phython            650
852      spring             900
852      spring mvc         900
From the above two tables I want to join the books, author, and cd_name, along with the total cost of the book and CD for each id.
Expected Output
Books     Book_id  author   cd_name         total price
angular2  132      venkat   angular2        2400
angular2  132      venkat   angular basics  2100
angular2  132      venkat   angular expert  2800
java      777      babbage  core java       7700
Like the result above, I need to get the total cost for all the books and CDs.
In case not all books had a CD:
SELECT A.Books
, A.Book_ID
, A.Author
, B.CD_Name
, A.rate+COALESCE(B.Cd_price,0) AS TOTAL_PRICE
FROM BOOK A
LEFT JOIN CD B ON A.BOOK_ID = B.BOOK_ID
The question's author pointed out that "The table name is book, not Books."
I originally used BOOKS as a "suggestion" because table names are usually plural.
Try this
select books, b.book_id, author, cd_name, (b.rate + c.cd_price) as total_price
from book b
join cd c on b.book_id = c.book_id

SQL Server 2008 Varbinary(Max) column - 28Mb of images creating 3.2Gb database

This is the first time I've tried to store images in my DB instead of on the file server, and I'm regretting it so far. I can't use FILESTREAM because my host doesn't support it, so I'm using a varbinary(max) column. I'm keeping track of the image sizes I insert and there are about 28 MB so far, but the database is at 3.2 GB, which is just crazy. Am I better off using varbinary(XXXX) to reduce this? Is SQL Server reserving space for the MAX?
Using MS SQL Server 2008 btw
Here are the top table sizes:
TableName RowCounts TotalSpaceKB UsedSpaceKB UnusedSpaceKB
Municipality 1028316 64264 64232 32
Image 665 33616 33408 208
User 320 248 224 24
SettingUser 5910 264 160 104
Region 1418 136 136 0
ImageUser 665 56 56 0
ConversationItem 164 56 56 0
Setting 316 48 48 0
Culture 378 40 40 0
UserTrack 442 40 40 0
Numbers 1000 32 32 0
Country 240 32 32 0
Conversation 52 32 32 0
CountryIp 0 88 32 56
ReportUser 0 16 16 0
ConversationItemImage 0 16 16 0
Here's the result for exec sp_spaceused:
database_size unallocated space
3268.88 MB 0.84 MB
reserved data index_size unused
359592 KB 291744 KB 66600 KB 1248 KB
I should probably also mention that there is a geography column on the Municipality table too, in case that has any impact due to spatial indexes. I've used this plenty of times in the past and had no issues, but I've never had 1M+ records either; usually it's fewer than 20k.
Make sure that all that space is being used by the actual data, and not the log file.
Shrinking the log file will only remove unused space. In order to clear entries before shrinking it, you would need to back up or truncate the log beforehand (warning: if you care at all about your log chain, this could break it).
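A quick way to confirm where that space actually lives (standard T-SQL, not part of the answer above; 'MyDb_log' is a placeholder for the real logical log file name). The sp_spaceused output above already hints at it: only about 350 MB is reserved by data, so the bulk of the 3.2 GB is almost certainly the transaction log.

SELECT name, type_desc, size * 8 / 1024 AS size_mb   -- size is stored in 8 KB pages
FROM sys.database_files;

DBCC SQLPERF(LOGSPACE);   -- log file size and percent used, per database

-- If the log is the culprit and breaking the log chain is acceptable,
-- back up the log (or switch to SIMPLE recovery) and then shrink it:
DBCC SHRINKFILE ('MyDb_log', 100);   -- target size in MB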

Appengine over quota

I have been using App Engine for a long time, but today my application went over quota (> my daily budget) after only 3 hours. The dashboard, however, shows that almost no resources have been used, so it should be nowhere near this daily limit.
Also strange is that, despite the dashboard saying my daily limit is reached, I have no problem retrieving; only writing to the datastore gives an over-quota exception (com.google.apphosting.api.ApiProxy$OverQuotaException: The API call datastore_v3.Put() required more quota than is available). The statistics below, however, show there were not a lot of writes. If I look at the quota details, all indicators are Okay.
Billing Status: Enabled (Daily budget: $2.00). Quotas reset every 24 hours. Next reset: 21 hrs
Resource                      Usage                Billable  Price                 Cost
Frontend Instance Hours       4.30 Instance Hours  0.00      $0.08/Hour            $0.00
Backend Instance Hours        0.00 Instance Hours  0.00      $0.08/Hour            $0.00
Datastore Stored Data         2.86 GBytes          1.86      $0.008/GByte-day      $0.02
Logs Stored Data              0.04 GBytes          0.00      $0.008/GByte-day      $0.00
Task Queue Stored Task Bytes  0.00 GBytes          0.00      $0.008/GByte-day      $0.00
Blobstore Stored Data         0.00 GBytes          0.00      $0.0043/GByte-day     $0.00
Code and Static File Storage  0.12 GBytes          0.00      $0.0043/GByte-day     $0.00
Datastore Write Operations    0.06 Million Ops     0.01      $1.00/Million Ops     $0.02
Datastore Read Operations     0.01 Million Ops     0.00      $0.70/Million Ops     $0.00
Datastore Small Operations    0.00 Million Ops     0.00      $0.10/Million Ops     $0.00
Outgoing Bandwidth            0.01 GBytes          0.00      $0.12/GByte           $0.00
Recipients Emailed            0                    0         $0.01/100 Recipients  $0.00
Stanzas Sent                  0                    0         $0.10/100K Stanzas    $0.00
Channels Created              0% (0 of 95,040)     0         $0.01/100 Opens       $0.00
Logs Read Bandwidth           0.00 GBytes          0.00      $0.12/GByte           $0.00
PageSpeed Outgoing Bandwidth  0.01 GBytes          0.01      $0.39/GByte           $0.01
SSL VIPs                      0                    0         $1.30/Day             $0.00
SSL SNI Certificates          0                    0         $0.06/Day             $0.00
Estimated cost for the last 3 hours: $2.00* / $2.00

Do ExtJS charts perform better than FusionCharts?

We're thinking about replacing FusionCharts with ExtJS charts in our application, since:
We already use ExtJS for our entire UI. It would be nice to remove the overhead and expense of another commercial third-party dependency and API.
We'd like to be able to display these charts on Flash-less mobile devices.
It's much harder to extend and manage FusionCharts' Flash components than normal DOM objects.
A few particular pages of our app are chock full of charts (on the order of hundreds of spark-like charts), and Flash is devouring memory like it's going out of style.
I've looked at FusionCharts's JavaScript fallback, and it's just not aesthetically sufficient. Plus, I don't want a JavaScript implementation that's a "fallback".
We're currently on ExtJS 3.2.0. Upgrading to 4.x is out of the question for the short term, but we could potentially sandbox Ext 4 to use just its charting if we deem it worth the effort.
So my question is essentially: does ExtJS 4's JavaScript charting perform significantly better than FusionCharts' Flash charting? I'm mostly concerned with memory usage, and secondarily with render time.
I see this Stack Overflow question indicating that, at least as of August 2011, Ext charts weren't really up to snuff. I know Sencha was concentrating on improving stability and performance in 4.1. Does anyone know if it's gotten better since then?
TL;DR
I saw phenomenal reductions in memory usage, CPU load, and render time by using charts in ExtJS 4.0.7 rather than FusionCharts 3.2, usually on the order of 70–85%.
Intro
I recently got some time to test Ext's charting. It was mildly painful rewriting the components to integrate Ext 4 charts into Ext 3 panels, but with a few days' work I could chart actual data from the server.
The basic charting problem I was trying to solve is shown in the image below:
We chart a trend of power readings for a number of outlets on a device. This worked fine in FusionCharts until recently, when we started rendering devices with 168 outlets (and potentially several of these devices on a single page). I suspected that no browser would be able to handle that much Flash, so I built a basic page to render one of these devices and tested it in a few different browsers.
Test Results
"F" means FusionCharts. "E" means ExtJS.
Hardware:
OS X: 15-inch MacBook Pro 5,1, 2.4 GHz Intel Core 2 Duo, 4 GB RAM
Win7: 21-inch iMac 4,1, 1.83 GHz Intel Core 2 Duo, 2 GB RAM
WinXP: same iMac running XP in VirtualPC (1 GB RAM)
=========
OS X 10.7
=========
Browser/Test Real Mem (MB) Virt Mem (MB) Priv Mem (MB) CPU (%) Render (s)
--------------------------------------------------------------------------------------------
Chrome 17.0.963.56
F1 653 532 333 14 22.8
F2 648 535 336 15 22.7
F3 656 538 339 15 22.3
--- --- --- --- ----
652 535 336 15 22.6
E1 104 129 80 0 4.0
E2 104 129 80 0 4.7
E3 104 129 80 0 3.7
--- --- --- --- ----
104 129 80 0 4.1
+/- -84% -76% -76% -100% -82%
Firefox 10.0.2
F1 905 450 257 14 10.1
F2 889 435 242 15 10.5
F3 889 465 272 15 10.1
--- --- --- --- ----
894 450 257 15 10.2
E1 239 230 161 0 3.5
E2 256 215 177 0 3.7
E3 253 218 181 0 4.6
--- --- --- --- ----
249 221 173 0 3.9
+/- -72% -51% -67% -100% -62%
Safari 5.1.3
F1 1070 998 717 16 22.7
F2 1130 993 670 16 23.0
F3 1120 902 631 17 22.9
---- --- --- --- ----
1107 964 673 16 22.9
E1 153 290 125 0 3.4
E2 153 291 125 0 3.5
E3 153 291 125 0 3.3
---- --- --- --- ----
153 291 125 0 3.4
+/- -86% -70% -81% -100% -85%
=========
Windows 7
=========
Browser Working Set (MB) Priv Working Set (MB) Commit Size (MB) CPU (%) Render (s)
------------------------------------------------------------------------------------------------------
Chrome 17.0.963.56
F1 638 619 633 45 16.9
F2 639 620 633 43 16.8
F3 639 620 633 45 16.9
--- --- --- --- ----
639 620 633 45 16.9
E1 100 85 96 0 4.4
E2 95 81 92 0 4.5
E3 101 87 98 0 4.3
--- --- --- --- ----
99 84 95 0 4.4
+/- -85% -87% -85% -100% -74%
Firefox 10.0.2
F1 650 638 657 52 11.5
F2 655 641 659 54 16.9
F3 650 638 656 52 11.4
--- --- --- --- ----
651 639 657 52 13.3
E1 138 111 119 0 3.6
E2 141 113 121 0 3.6
E3 134 106 114 0 3.8
--- --- --- --- ----
138 110 118 0 3.6
+/- -79% -83% -82% -100% -73%
IE 9.0.8112.16421
F1 688 660 702 19 13.1
F2 645 617 661 16 19.0
F3 644 615 660 15 19.0
--- --- --- --- ----
659 631 674 17 17.0
E1 100 73 90 0 4.8
E2 98 73 90 0 4.5
E3 99 73 90 0 4.3
--- --- --- --- ----
99 73 90 0 4.5
+/- -85% -88% -87% -100% -74%
==========
Windows XP
==========
Browser/Test Mem Usage (MB) Virt Mem Usage (MB) CPU (%) Render (s)
--------------------------------------------------------------------------------
IE 8.0.6001.18702
F1 653 658 56 19.5
F2 652 658 58 19.6
F3 652 658 60 18.9
--- --- --- ----
652 658 58 19.3
E1 272 266 2 38.5
E2 271 266 2 37.4
E3 271 266 2 37.3
--- --- --- ----
271 266 2 37.7
+/- -58% -60% -97% +95%
IE 7.0.5730.13
F1 721 726 80 29.1
F2 691 698 75 25.9
F3 695 698 78 27.0
--- --- --- ----
702 707 78 27.3
E1 302 294 1 67.4
E2 301 294 0 66.5
E3 301 294 0 65.8
--- --- --- ----
301 294 0 66.6
+/- -57% -68% -100% +144%
Notes:
- CPU (%) was measured once the charts had finished rendering and
the browser was idling.
- Render (s) was the time measured between when the data finished
loading and when the charts were fully rendered and usable.
Conclusions
In every metric other than render time on IE8 and IE7, ExtJS charts outperformed FusionCharts by a wide margin. Although the tests were specific to our use case, I would expect to see similar (if less drastic) results in similar situations — i.e., lots of charts on a single page.
This is to say nothing of the qualitative benefits of native charts, like real DOM scripting and styling, direct integration with the rest of the ExtJS framework, and access to charts on Flash-less mobile devices. If you can invest the time, charting in Ext 4 is a huge win.
In my experience, ExtJS 4 charts are still raw and have a lot of issues. For example, the Time axis is really buggy, and I had to find workarounds just to display several series in a line chart (I finally replaced it with a Numeric axis, loading timestamps into it). It also has performance issues on big data sets, so I had to group the data and reduce it to smaller sets.
On the other hand, I'm really glad that Sencha eventually replaced the Flash charts with HTML5 ones. It gives you the freedom to modify and adjust a chart as you want. Sometimes that requires looking into the chart's source code, but it's not Flash, and that's cool! I believe Sencha will improve their charts soon.
