I am loading XML from a text file and getting a result that looks strange to me.
The text file contains 2 customers and 6 products.
The result set is:
CUSTID ORDER ID
98295 29199752211 0 1 2321
98295 29199752211 0 1 76
98295 29199752211 0 2 179
98295 29199752211 0 3 180
98295 29199752211 0 4 320
98295 29199752211 0 5 NULL
Why is the CUSTID the same on every row? There are 2 in the text file. I would be glad of some help.
SELECT
(SELECT LNGNO FROM ARTUT13.DBO.TBLFATURA WHERE TXTOZELKOD=(c6.value('(//FISLER/FIS/FISID)[1]','VARCHAR(100)'))),--[LNGNO]
0,--[BYTTUR]
c6.value('(KALEMNO)[1]','VARCHAR(100)'),--[LNGKALEMSIRA]
(SELECT LNGKOD FROM ARTUT13.DBO.TBLURUN WHERE TXTKOD=(c6.value('(URUNKODU)[1]','VARCHAR(100)'))),
c6.value('(MIKTAR)[1]','VARCHAR(100)'),--[DBLMIKTAR]
1,--[BYTBIRIMSIRA]
1,--[DBLCEVRIM]
c6.value('(BIRIMFIYAT)[1]','VARCHAR(100)'),--[DBLBIRIMFIYAT]
0,--[BYTKAYITTIP]
0,--[BYTDETAYMAL]
c6.value('(KDV)[1]','VARCHAR(100)'),--[DBLKDVORANI]
c6.value('(FIYAT)[1]','VARCHAR(100)'),--[DBLNETFIYAT]
'',--[TXTOZELKOD]
0,--[LNGVADEGUNU]
GETDATE(),--[TRHSONISLEMTARIHI]
'MUHASEBE2',--[TXTSONISLEMHOST]
'',--[DBLOTV]
c6.value('(//FISLER/FIS/FISID)[1]','VARCHAR(100)'),--[TXTOZELKOD1]
''--[TXTOZELKOD2]
from
(select cast(c1 as xml) from OPENROWSET (BULK 'C:\AKTAR\FATURA.txt',SINGLE_BLOB
) as T1(c1) )as T2(c2)
outer apply c2.nodes('FISLER/FIS/KALEMLER/KALEM') T6(c6)
The text file contains:
<FISLER>
<FIS>
<FISTIPI>SATIS</FISTIPI>
<FISID>29199752211</FISID>
<FISNO>a67502</FISNO>
<IPTAL>0</IPTAL>
<TARIH>13.02.2013</TARIH>
<MUSKODU>35170339P</MUSKODU>
<MUSADI>MEHMET PEHLIVAN - MORTAN GIDA MEHMET PEHLIVAN</MUSADI>
<VERGIDAIRESI>KARABURUN MAL MD</VERGIDAIRESI>
<VERGINO>47035582576</VERGINO>
<DEPOKODU>01</DEPOKODU>
<ODEMETIPI>6</ODEMETIPI>
<TOPLAMBRUT>1200.24</TOPLAMBRUT>
<TOPLAMISKONTO>60.01</TOPLAMISKONTO>
<TOPLAMKDV>205.24</TOPLAMKDV>
<TOPLAMNET>1345.47</TOPLAMNET>
<SATISTEMSILCISIKODU>001</SATISTEMSILCISIKODU>
<DAGITICIKODU></DAGITICIKODU>
<ARACKODU></ARACKODU>
<ARACPLAKA></ARACPLAKA>
<SEVKNO></SEVKNO>
<VADETARIHI>06.03.2013</VADETARIHI>
<KALEMLER>
<KALEM>
<KALEMNO>1</KALEMNO>
<URUNKODU>4009011024</URUNKODU>
<URUNADI>EFE KLASİK RAKI45º-100clx12AD TAVA( 63,50 FİYATLI)</URUNADI>
<MIKTAR>24</MIKTAR>
<BIRIMFIYAT>50.01</BIRIMFIYAT>
<FIYAT>1200.24</FIYAT>
<BIRIM></BIRIM>
<KDV>18</KDV>
<ISKONTOLAR>
<ISKONTO>
<price>1200.24</price>
<KODU></KODU>
<ADI>Ürün İsk.1</ADI>
<TIPI></TIPI>
<ORAN>5</ORAN>
<TUTAR>60.012</TUTAR>
</ISKONTO>
</ISKONTOLAR>
</KALEM>
</KALEMLER>
</FIS>
<FIS>
<FISTIPI>SATIS</FISTIPI>
<FISID>29199773107</FISID>
<FISNO>a67511</FISNO>
<IPTAL>0</IPTAL>
<TARIH>13.02.2013</TARIH>
<MUSKODU>100242</MUSKODU>
<MUSADI>NUMBER ONE APART OTEL RESTAURANT</MUSADI>
<VERGIDAIRESI>KARABURUN</VERGIDAIRESI>
<VERGINO>50545253560</VERGINO>
<DEPOKODU>01</DEPOKODU>
<ODEMETIPI>6</ODEMETIPI>
<TOPLAMBRUT>2634.24</TOPLAMBRUT>
<TOPLAMISKONTO>195.21</TOPLAMISKONTO>
<TOPLAMKDV>439.03</TOPLAMKDV>
<TOPLAMNET>2878.06</TOPLAMNET>
<SATISTEMSILCISIKODU>001</SATISTEMSILCISIKODU>
<DAGITICIKODU></DAGITICIKODU>
<ARACKODU></ARACKODU>
<ARACPLAKA></ARACPLAKA>
<SEVKNO></SEVKNO>
<VADETARIHI>06.03.2013</VADETARIHI>
<KALEMLER>
<KALEM>
<KALEMNO>1</KALEMNO>
<URUNKODU>4001017212</URUNKODU>
<URUNADI>EFE YAŞ ÜZÜM RAKISI (45º) - 70 cl 12LI KOLİ</URUNADI>
<MIKTAR>12</MIKTAR>
<BIRIMFIYAT>47.03</BIRIMFIYAT>
<FIYAT>564.36</FIYAT>
<BIRIM></BIRIM>
<KDV>18</KDV>
<ISKONTOLAR>
<ISKONTO>
<price>564.36</price>
<KODU></KODU>
<ADI>Ürün İsk.1</ADI>
<TIPI></TIPI>
<ORAN>10</ORAN>
<TUTAR>56.436</TUTAR>
</ISKONTO>
</ISKONTOLAR>
</KALEM>
<KALEM>
<KALEMNO>2</KALEMNO>
<URUNKODU>4001012324</URUNKODU>
<URUNADI>EFE YAŞ ÜZÜM RAKISI (45º) - 20 cl 24 LU KOLİ</URUNADI>
<MIKTAR>24</MIKTAR>
<BIRIMFIYAT>16.07</BIRIMFIYAT>
<FIYAT>385.68</FIYAT>
<BIRIM></BIRIM>
<KDV>18</KDV>
<ISKONTOLAR>
<ISKONTO>
<price>385.68</price>
<KODU></KODU>
<ADI>Ürün İsk.1</ADI>
<TIPI></TIPI>
<ORAN>10</ORAN>
<TUTAR>38.568</TUTAR>
</ISKONTO>
</ISKONTOLAR>
</KALEM>
<KALEM>
<KALEMNO>3</KALEMNO>
<URUNKODU>4001013724</URUNKODU>
<URUNADI>EFE YAŞ ÜZÜM RAKISI (45º) - 35 cl 24 LU KOLİ</URUNADI>
<MIKTAR>24</MIKTAR>
<BIRIMFIYAT>26.66</BIRIMFIYAT>
<FIYAT>639.84</FIYAT>
<BIRIM></BIRIM>
<KDV>18</KDV>
<ISKONTOLAR>
<ISKONTO>
<price>639.84</price>
<KODU></KODU>
<ADI>Ürün İsk.1</ADI>
<TIPI></TIPI>
<ORAN>10</ORAN>
<TUTAR>63.984</TUTAR>
</ISKONTO>
</ISKONTOLAR>
</KALEM>
<KALEM>
<KALEMNO>4</KALEMNO>
<URUNKODU>4001011013</URUNKODU>
<URUNADI>EFE YAŞ ÜZÜM RAKISI (45º) - 100 cl 12LI TAVA</URUNADI>
<MIKTAR>6</MIKTAR>
<BIRIMFIYAT>60.37</BIRIMFIYAT>
<FIYAT>362.22</FIYAT>
<BIRIM></BIRIM>
<KDV>18</KDV>
<ISKONTOLAR>
<ISKONTO>
<price>362.22</price>
<KODU></KODU>
<ADI>Ürün İsk.1</ADI>
<TIPI></TIPI>
<ORAN>10</ORAN>
<TUTAR>36.222</TUTAR>
</ISKONTO>
</ISKONTOLAR>
</KALEM>
<KALEM>
<KALEMNO>5</KALEMNO>
<URUNKODU>4010017001</URUNKODU>
<URUNADI>EFE 5 YILLIK RAKI45º-70clx3AD KOLİ</URUNADI>
<MIKTAR>6</MIKTAR>
<BIRIMFIYAT>113.69</BIRIMFIYAT>
<FIYAT>682.14</FIYAT>
<BIRIM></BIRIM>
<KDV>18</KDV>
</KALEM>
</KALEMLER>
</FIS>
</FISLER>
1) Your question is not clear: how is CUSTID supposed to be extracted from that XML? There is no CUSTID element in it.
2) Judging from your source code, the likely cause is the use of absolute references (.value('(//element...)[1]',...)) instead of relative references (.value('(element...)[1]',...)).
Sample:
DECLARE @x XML;
SET @x = N'<...>';
SELECT c6.value('(//FISLER/FIS/FISID)[1]','VARCHAR(100)') AS AbsoluteRef_FISID
FROM @x.nodes('FISLER/FIS/KALEMLER/KALEM') T6(c6)
SELECT c6.value('(FISID)[1]','VARCHAR(100)') AS RelativeRef_FISID
FROM @x.nodes('FISLER/FIS') T6(c6)
Results:
AbsoluteRef_FISID
-------------------
29199752211
29199752211
29199752211
29199752211
29199752211
29199752211
RelativeRef_FISID
-------------------
29199752211
29199773107
The leading // makes the path absolute: it searches the whole document regardless of the current node, so c6.value('(//FISLER/FIS/FISID)[1]','VARCHAR(100)') returns only the first ([1]) FISID value in that XML on every row.
.value('(FISID)[1]',...) uses a relative reference (relative to nodes('FISLER/FIS') T6(c6)), so the result contains all FISID values.
If you run this query
SELECT c6.query('.') AS XmlNode
FROM @x.nodes('FISLER/FIS') T6(c6);
you will get two rows, because .nodes('FISLER/FIS') extracts two rows from the @x XML variable:
XmlNode
----------------------------------------------------------------------------
<FIS><FISTIPI>SATIS</FISTIPI><FISID>29199752211</FISID><FISNO>a67502</FISNO>
<FIS><FISTIPI>SATIS</FISTIPI><FISID>29199773107</FISID><FISNO>a67511</FISNO>
Starting from these two rows, the value method .value('(FISID)[1]',...) extracts the first FISID ((FISID)[1]) for each row. Thus, you get two FISID values.
SQLFiddle demo
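The same absolute-versus-relative difference can be reproduced outside SQL Server. Below is a minimal Python sketch using the standard library's xml.etree.ElementTree; the XML is a trimmed-down, hypothetical version of the FISLER file (two FIS elements, three KALEM elements), and the variable names are mine, not part of the original query.

```python
import xml.etree.ElementTree as ET

# Trimmed-down, hypothetical version of the FISLER document: two FIS
# elements, the first with one KALEM and the second with two.
XML_TEXT = """
<FISLER>
  <FIS>
    <FISID>29199752211</FISID>
    <KALEMLER>
      <KALEM><KALEMNO>1</KALEMNO></KALEM>
    </KALEMLER>
  </FIS>
  <FIS>
    <FISID>29199773107</FISID>
    <KALEMLER>
      <KALEM><KALEMNO>1</KALEMNO></KALEM>
      <KALEM><KALEMNO>2</KALEMNO></KALEM>
    </KALEMLER>
  </FIS>
</FISLER>
"""

root = ET.fromstring(XML_TEXT)  # root is the FISLER element

# "Absolute" lookup: always start from the document root and take the
# first FISID found, no matter which KALEM is being processed.
absolute_rows = [
    (root.find("FIS/FISID").text, kalem.findtext("KALEMNO"))
    for kalem in root.findall("FIS/KALEMLER/KALEM")
]

# "Relative" lookup: iterate over FIS first, then read FISID and the
# KALEM children relative to that FIS.
relative_rows = [
    (fis.findtext("FISID"), kalem.findtext("KALEMNO"))
    for fis in root.findall("FIS")
    for kalem in fis.findall("KALEMLER/KALEM")
]

print(absolute_rows)
print(relative_rows)
```

In the absolute version every row carries the first FISID, which is the symptom described in the question. In T-SQL the relative version would typically shred in two steps: nodes('FISLER/FIS') first, then a second nodes('KALEMLER/KALEM') on each FIS through CROSS APPLY or OUTER APPLY.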
The code is as follows:
db=database("dfs://db1",VALUE,1 2 3)
timestamp = [09:34:07,09:36:42,09:36:51,09:36:59,09:32:47,09:35:26,09:34:16,09:34:26,09:38:12]
sym = `C`MS`MS`MS`IBM`IBM`C`C`C
price= 49.6 29.46 29.52 30.02 174.97 175.23 50.76 50.32 51.29
qty = 2200 1900 2100 3200 6800 5400 1300 2500 8800
t = table(timestamp, sym, qty, price)
dt=db.createTable(t,`dt).append!(t)
How much disk space does the table dt consume?
You can use the getTabletsMeta function to query the disk space usage of a partitioned table. The code is below; the return value is in bytes:
def diskUsage(database, table){
return select sum(diskUsage) from getTabletsMeta("/"+database+"/%", table, true, -1);
}
// the table in the question was created under the name dt
pnodeRun(diskUsage{"db1", "dt"})
I have an estimated time on a job, but when I add the employees' hours (2 employees in this case), the estimate is repeated on every row. I need to divide it by the number of result rows (perhaps employee records) to get the correct answer.
The SQL that pulls from the database:
SELECT
LaborDtl.JobNum,
LaborDtl.ClockInDate,
LaborDtl.OprSeq,
EmpBasic.Name,
(LaborDtl.LaborHrs) as [TotalHrs],
((JobOper.EstSetHours + JobOper.EstProdHours) / (COUNT (EmpBasic.Name))) as [TotEstHrs],
LaborDtl.ResourceGrpID
FROM Erp.LaborDtl
left outer JOIN Erp.JobOper ON
JobOper.JobNum = LaborDtl.JobNum
AND JobOper.OprSeq = LaborDtl.OprSeq
JOIN Erp.EmpBasic ON
EmpBasic.EmpID = LaborDtl.EmployeeNum
WHERE LaborDtl.Complete = '1'
AND LaborDtl.ClockInDate = '2019-7-1'
AND LaborDtl.ResourceGrpID = '5-XM-C'
AND LaborDtl.JobNum = 'PA16742'
GROUP BY
LaborDtl.JobNum,
LaborDtl.ClockInDate,
LaborDtl.OprSeq,
EmpBasic.Name,
LaborDtl.LaborHrs,
JobOper.EstSetHours,
JobOper.EstProdHours,
LaborDtl.EmployeeNum,
LaborDtl.ResourceGrpID
JobNum ClockInDate OprSeq Name TotalHrs TotEstHrs ResourceGrpID
pa16742 2019-07-01 20 Jerry Adam 1.6300 5.00 5-XM-C
PA16742 2019-07-01 20 Xue Lee 2.68000 5.00 5-XM-C
In this case, the TotEstHrs should be 2.5 on each line.
I think this does what you want:
((JobOper.EstSetHours + JobOper.EstProdHours) / SUM(COUNT(EmpBasic.Name))
OVER ()) as [TotEstHrs],
COUNT(EmpBasic.Name) is evaluated per group; SUM(...) OVER () then adds those counts across all result rows, and the estimate is divided by that total.
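To check the arithmetic outside SQL, here is a small Python sketch. The rows are hard-coded from the sample output above, and grouping by (JobNum, OprSeq) is my own assumption about which rows share one estimate; it is not something the original query states.

```python
from collections import Counter

# Hypothetical rows mirroring the sample output:
# (JobNum, OprSeq, Name, LaborHrs, estimated hours for the operation)
rows = [
    ("PA16742", 20, "Jerry Adam", 1.63, 5.0),
    ("PA16742", 20, "Xue Lee", 2.68, 5.0),
]

# Number of result rows that share the same job and operation.
rows_per_operation = Counter((job, opr) for job, opr, _, _, _ in rows)

# Spread each operation's estimate evenly across its rows.
split = [
    (name, hours, estimate / rows_per_operation[(job, opr)])
    for job, opr, name, hours, estimate in rows
]

for name, hours, share in split:
    print(name, hours, share)
```

Counting per job and operation rather than over all rows is a design choice: in SQL it would correspond to a windowed count with PARTITION BY on those columns instead of an empty OVER (). For the sample, where both rows belong to the same job and operation, the two approaches give the same 2.5.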
For the below data I want to order it by AverageOfTotal then take the top ItemNumbers where the sum of the average is up to x.
ItemNumber AverageOfTotal
item-1 0.0235
item-2 0.0149
item-3 0.0203
item-4 0.0101
item-5 0.0084
item-6 0.0096
item-7 0.0092
item-8 0.0062
item-9 0.0069
item-10 0.0084
item-11 0.0132
item-12 0.0058
item-13 0.0094
item-14 0.0028
item-15 0.0061
item-16 0.0047
item-17 0.0038
item-18 0.0021
item-19 0.004
item-20 0.0083
item-21 0.0048
item-22 0.0058
item-23 0.0153
item-24 0.0025
item-25 0.0022
item-26 0.0086
item-27 0.0076
item-28 0.0097
item-29 0.0009
item-30 0.0042
item-31 0.0099
item-32 0.0036
For example, if I wanted only the top ItemNumbers up to a sum of 0.1, it would return
item-3 item-23 item-2 item-11 item-4 item-31
How do I get this column (the running sum of the averages)?
item-1 0.0235 0.0235
item-3 0.0203 0.0438
item-23 0.0153 0.0591
item-2 0.0149 0.074
item-11 0.0132 0.0872
item-4 0.0101 0.0973
item-31 0.0099 0.1072
item-28 0.0097 0.1169
item-6 0.0096 0.1265
item-13 0.0094 0.1359
item-7 0.0092 0.1451
item-26 0.0086 0.1537
item-5 0.0084 0.1621
item-10 0.0084 0.1705
item-20 0.0083 0.1788
item-27 0.0076 0.1864
item-9 0.0069 0.1933
item-8 0.0062 0.1995
item-15 0.0061 0.2056
item-12 0.0058 0.2114
item-22 0.0058 0.2172
item-21 0.0048 0.222
item-16 0.0047 0.2267
item-30 0.0042 0.2309
item-19 0.004 0.2349
item-17 0.0038 0.2387
item-32 0.0036 0.2423
item-14 0.0028 0.2451
item-24 0.0025 0.2476
item-25 0.0022 0.2498
item-18 0.0021 0.2519
item-29 0.0009 0.2528
Goal:
How do I generate the column mentioned above? The key is that I first need to order by AverageOfTotal, then sum the values row by row up to a given number, and return those items.
Guessing from the numbers in your example, you might be looking for something like this:
Select
ItemNumber,
AverageOfTotal,
(
select sum(AverageOfTotal)
from tbl
where AverageOfTotal >= yt.AverageOfTotal
) as summedAvgBiggerEqualThisOne
from tbl yt
where
(
select sum(AverageOfTotal)
from tbl
where AverageOfTotal >= yt.AverageOfTotal
) < 0.1
order by AverageOfTotal desc
Newer SQL Server versions support windowed aggregates (SUM(...) OVER (ORDER BY ...)) that compute a running total more neatly; I have not played with them yet (see Chris Mack's answer).
For inspiration, see e.g.: Calculate a Running Total in SQL Server
DDL:
CREATE TABLE tbl ( ItemNumber varchar(7), AverageOfTotal decimal(6,6));
INSERT INTO tbl ( ItemNumber , AverageOfTotal )
VALUES
('item-1', 0.0235), ('item-2', 0.0149), ('item-3', 0.0203), ('item-4', 0.0101),
('item-5', 0.0084), ('item-6', 0.0096), ('item-7', 0.0092), ('item-8', 0.0062),
('item-9', 0.0069), ('item-10', 0.0084), ('item-11', 0.0132), ('item-12', 0.0058),
('item-13', 0.0094), ('item-14', 0.0028), ('item-15', 0.0061), ('item-16', 0.0047),
('item-17', 0.0038), ('item-18', 0.0021), ('item-19', 0.004), ('item-20', 0.0083),
('item-21', 0.0048), ('item-22', 0.0058), ('item-23', 0.0153), ('item-24', 0.0025),
('item-25', 0.0022), ('item-26', 0.0086), ('item-27', 0.0076), ('item-28', 0.0097),
('item-29', 0.0009), ('item-30', 0.0042), ('item-31', 0.0099), ('item-32', 0.0036)
;
Result:
ItemNumber AverageOfTotal summedAvgBiggerEqualThisOne
item-1 0.0235 0.0235
item-3 0.0203 0.0438
item-23 0.0153 0.0591
item-2 0.0149 0.074
item-11 0.0132 0.0872
item-4 0.0101 0.0973
This will do it:
SELECT
ItemNumber
, AverageOfTotal
, SumOfAverageOfTotal
FROM
(
SELECT
ItemNumber
, AverageOfTotal
, SUM(AverageOfTotal) OVER (ORDER BY AverageOfTotal DESC) SumOfAverageOfTotal
FROM YourTable
) Q
WHERE SumOfAverageOfTotal < 0.1 -- or <=
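For anyone who wants to verify the expected rows without a database, the same logic (sort descending, keep a running total, stop at the limit) can be sketched in plain Python. The function name top_items_under is hypothetical, and Decimal is used only to avoid floating-point noise in the totals.

```python
from decimal import Decimal
from itertools import accumulate

# Data copied from the question as (ItemNumber, AverageOfTotal) pairs.
RAW = """item-1 0.0235 item-2 0.0149 item-3 0.0203 item-4 0.0101
item-5 0.0084 item-6 0.0096 item-7 0.0092 item-8 0.0062
item-9 0.0069 item-10 0.0084 item-11 0.0132 item-12 0.0058
item-13 0.0094 item-14 0.0028 item-15 0.0061 item-16 0.0047
item-17 0.0038 item-18 0.0021 item-19 0.004 item-20 0.0083
item-21 0.0048 item-22 0.0058 item-23 0.0153 item-24 0.0025
item-25 0.0022 item-26 0.0086 item-27 0.0076 item-28 0.0097
item-29 0.0009 item-30 0.0042 item-31 0.0099 item-32 0.0036""".split()

items = [(RAW[i], Decimal(RAW[i + 1])) for i in range(0, len(RAW), 2)]


def top_items_under(pairs, limit):
    """Return (name, value, running_total) rows whose running total is below limit."""
    # Largest averages first; sorted() is stable, so ties keep input order.
    ordered = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    # Running total of the sorted values.
    totals = accumulate(value for _, value in ordered)
    return [
        (name, value, total)
        for (name, value), total in zip(ordered, totals)
        if total < limit
    ]


result = top_items_under(items, Decimal("0.1"))
for name, value, total in result:
    print(name, value, total)
```

With a limit of 0.1 this keeps item-1, item-3, item-23, item-2, item-11 and item-4, matching the result table of the first answer; changing the comparison to <= mirrors the "or <=" comment in the SQL.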
FPL_ID AFSKEY FLIGHTNO FLIGHTYPE STAD AIRCRAFTTYPECODE TAILNO STANDCODE
1733285 4383931 UL 0314 A 2014-01-01 05:35:00.000 A343 4RADC C015
1733554 4382525 UL 0315 D 2014-01-01 08:25:00.000 A343 4RADC C015
1733385 4382929 AK 5107 A 2014-01-01 07:00:00.000 A320 9MAFB F086
1733484 4381571 AK 5212 D 2014-01-01 07:25:00.000 A320 9MAFB F086
I need help.
How do I pair rows into a single row based on FLIGHTYPE (A = Arrival, D = Departure)?
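Just inner join the table to itself. In the example below, f1 contains arrivals and f2 departures. In the sample rows an arrival and its departure have different FLIGHTNO values but share TAILNO and STANDCODE, so the join matches on those columns.
select f1.*, f2.* -- replace with the list of columns you need
from flights f1
inner join flights f2
on f1.TAILNO = f2.TAILNO
and f1.STANDCODE = f2.STANDCODE
and f1.FPL_ID <> f2.FPL_ID
and f1.FLIGHTYPE = 'A' and f2.FLIGHTYPE = 'D'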
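As a sanity check of the pairing idea, here is a small Python sketch over the four sample rows. It assumes that an arrival and a departure belong together when they share TAILNO and STANDCODE; that key is inferred from the sample data and is not confirmed by the question. The Flight tuple and pair_flights function are hypothetical names.

```python
from collections import namedtuple

Flight = namedtuple("Flight", "FPL_ID FLIGHTNO FLIGHTYPE STAD TAILNO STANDCODE")

# Sample rows from the question (AFSKEY and AIRCRAFTTYPECODE left out).
flights = [
    Flight(1733285, "UL 0314", "A", "2014-01-01 05:35:00", "4RADC", "C015"),
    Flight(1733554, "UL 0315", "D", "2014-01-01 08:25:00", "4RADC", "C015"),
    Flight(1733385, "AK 5107", "A", "2014-01-01 07:00:00", "9MAFB", "F086"),
    Flight(1733484, "AK 5212", "D", "2014-01-01 07:25:00", "9MAFB", "F086"),
]


def pair_flights(rows):
    """Pair each arrival with departures sharing TAILNO and STANDCODE."""
    arrivals = [row for row in rows if row.FLIGHTYPE == "A"]
    departures = [row for row in rows if row.FLIGHTYPE == "D"]
    # Nested loop is the in-memory equivalent of a self-join.
    return [
        (arrival, departure)
        for arrival in arrivals
        for departure in departures
        if (arrival.TAILNO, arrival.STANDCODE)
        == (departure.TAILNO, departure.STANDCODE)
    ]


pairs = pair_flights(flights)
for arrival, departure in pairs:
    print(arrival.FLIGHTNO, arrival.STAD, "->", departure.FLIGHTNO, departure.STAD)
```

If the same aircraft visits the same stand more than once, this key alone would produce extra combinations, so a real query would probably also need a time condition (for example, the nearest departure after the arrival).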
I have a pyspark query which returns a WrappedArray:
det_port_arr = vessel_arrival_depart_df.select(vessel_arrival_depart_df['DetectedPortArrival'])
det_port_arr.show(2, truncate=False)
det_port_arr.dtypes
The output is a DataFrame with a single column, but that column is a struct which contains an array of structs:
|DetectedPortArrival |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[WrappedArray([portPoi,5555,BEILUN [CNBEI],marinePort], [portPoi,5729,NINGBO [CNNBO],marinePort], [portPoi,5730,NINGBO PT [CNNBG],marinePort]),device,Moored]|
|null |
[('DetectedPortArrival',
'struct<poiMeta:array<struct<poiCategory:string,poiId:bigint,poiName:string,poiType:string>>,sourceType:string,statusType:string>')]
If I try to select the poiMeta member of the struct:
temp = vessel_arrival_depart_df.select(vessel_arrival_depart_df['DetectedPortArrival']['poiMeta'])
temp.show(truncate=False)
print type(temp)
I obtain
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|DetectedPortArrival.poiMeta |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[[portPoi,5555,BEILUN [CNBEI],marinePort], [portPoi,5729,NINGBO [CNNBO],marinePort], [portPoi,5730,NINGBO PT [CNNBG],marinePort]] |
|null |
Here are the data types:
temp.dtypes
[('DetectedPortArrival.poiMeta',
'array<struct<poiCategory:string,poiId:bigint,poiName:string,poiType:string>>')]
But here's the problem: I don't seem to be able to query that column DetectedPortArrival.poiMeta:
df2 = temp.selectExpr("DetectedPortArrival.poiMeta")
df2.show(2)
AnalysisException                         Traceback (most recent call last)
<ipython-input-46-c7f0041cffe9> in <module>()
----> 1 df2 = temp.selectExpr("DetectedPortArrival.poiMeta")
2 df2.show(3)
/opt/spark/spark-2.1.0-bin-hadoop2.4/python/pyspark/sql/dataframe.py in selectExpr(self, *expr)
996 if len(expr) == 1 and isinstance(expr[0], list):
997 expr = expr[0]
--> 998 jdf = self._jdf.selectExpr(self._jseq(expr))
999 return DataFrame(jdf, self.sql_ctx)
1000
/opt/spark/spark-2.1.0-bin-hadoop2.4/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
1131 answer = self.gateway_client.send_command(command)
1132 return_value = get_return_value(
-> 1133 answer, self.gateway_client, self.target_id, self.name)
1134
1135 for temp_arg in temp_args:
/opt/spark/spark-2.1.0-bin-hadoop2.4/python/pyspark/sql/utils.py in deco(*a, **kw)
67 e.java_exception.getStackTrace()))
68 if s.startswith('org.apache.spark.sql.AnalysisException: '):
---> 69 raise AnalysisException(s.split(': ', 1)[1], stackTrace)
70 if s.startswith('org.apache.spark.sql.catalyst.analysis'):
71 raise AnalysisException(s.split(': ', 1)[1], stackTrace)
AnalysisException: u"cannot resolve '`DetectedPortArrival.poiMeta`' given input columns: [DetectedPortArrival.poiMeta]; line 1 pos 0;\n'Project ['DetectedPortArrival.poiMeta]\n+- Project [DetectedPortArrival#268.poiMeta AS DetectedPortArrival.poiMeta#503]\n +- Project [asOf#263, vesselId#264, DetectedPortArrival#268, DetectedPortDeparture#269]\n +- Sort [asOf#263 ASC NULLS FIRST], true\n +- Project [smfPayloadData#1.paired.shipmentId AS shipmentId#262, smfPayloadData#1.timestamp.asOf AS asOf#263, smfPayloadData#1.paired.vesselId AS vesselId#264, smfPayloadData#1.paired.vesselName AS vesselName#265, smfPayloadData#1.geolocation.speed AS speed#266, smfPayloadData#1.geolocation.detectedPois AS detectedPois#267, smfPayloadData#1.events.DetectedPortArrival AS DetectedPortArrival#268, smfPayloadData#1.events.DetectedPortDeparture AS DetectedPortDeparture#269]\n +- Filter ((((cast(smfPayloadData#1.paired.vesselId as double) = cast(9776183 as double)) && isnotnull(smfPayloadData#1.paired.shipmentId)) && (length(smfPayloadData#1.paired.shipmentId) > 0)) && (isnotnull(smfPayloadData#1.paired.vesselId) && (isnotnull(smfPayloadData#1.events.DetectedPortArrival) || isnotnull(smfPayloadData#1.events.DetectedPortDeparture))))\n +- SubqueryAlias smurf_processed\n +- Relation[smfMetaData#0,smfPayloadData#1,smfTransientData#2] parquet\n"
Any suggestions as to how to query that column?
Can't you just select the column by its index? Something like:
temp.select(temp.columns[0]).show()
Best Regards
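The error message suggests that the dot in the column name is being read as a struct-field access (look up DetectedPortArrival, then poiMeta) rather than as part of one flat name. A commonly used workaround is to wrap the whole name in backticks before passing it to selectExpr. The helper below is a hypothetical sketch in plain Python; doubling embedded backticks is my assumption about how Spark SQL escapes them, and the Spark call itself is only shown in a comment, not executed.

```python
def quote_column(name):
    """Wrap a column name in backticks so a dot stays part of the name.

    Embedded backticks are doubled (assumed Spark SQL escaping rule).
    """
    return "`" + name.replace("`", "``") + "`"


# The flat column produced by the earlier select has a dot in its name.
expr = quote_column("DetectedPortArrival.poiMeta")
print(expr)

# With the DataFrame from the question this would be used roughly as:
#   df2 = temp.selectExpr(expr)
# Renaming the column first (for example with an alias that has no dot)
# is another way to avoid the quoting altogether.
```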