Inserting data from SQL Server to Postgres - UTF8 0x00 error

I am using PySpark to read data from SQL Server and write it to a Postgres DB on AWS.
The INSERT to Postgres fails with this error
ERROR: invalid byte sequence for encoding "UTF8": 0x00
I have been searching for a fix but have had no luck. The statements I have used to find char(0) say there are none in any of the columns. I'm using this:
cast(ExportName AS varchar) like '%' + char(0) +'%'
REPLACE doesn't work.
I tried changing the encoding in pyspark script with
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
but still getting the error.
Any help is greatly appreciated.
Edit: Here is the full pyspark code
# pyspark --jars /deployment/mssql-jdbc-9.2.0.jre8.jar,/deployment/postgresql-42.2.11.jar --num-executors 3 --executor-cores 8 --executor-memory 16g
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession
conf = SparkConf()
conf.setMaster("local").setAppName("My app")
sc = SparkContext.getOrCreate(conf=conf)
spark = SparkSession(sc)
ms_url = "jdbc:sqlserver://hostname;instanceName=instance;databaseName=database;"
ms_user = "username"
ms_password = "password"
mssql_query = ''' (SELECT *
FROM DBO.TBL_SENSORS
WHERE TYPE IN ('S','C')) query'''
jdbcDF = spark.read.format("jdbc") \
.option("url", ms_url) \
.option("dbtable", mssql_query) \
.option("user", ms_user) \
.option("password", ms_password) \
.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
.load()
pg_url = "jdbc:postgresql://hostname:5432/database"
pg_user = "pg_username"
pg_password = "pg_password"
pg_table = "SCO.TBL_SENSORS"
jdbcDF.write.format("jdbc") \
.option("driver", "org.postgresql.Driver") \
.option("url", pg_url)\
.option("user", pg_user) \
.option("password", pg_password) \
.option("dbtable", pg_table) \
.mode("append") \
.save()
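For what it's worth, PostgreSQL text columns cannot store the NUL character (U+0000) at all, so any 0x00 has to be removed before the write, whatever the search on the SQL Server side reports. Below is a minimal sketch of that cleanup in plain Python 3. The helper names `strip_nul` and `clean_row` are hypothetical, and the sample row is made up for illustration. In PySpark the same idea could presumably be applied to each string column before `jdbcDF.write` (for example with `regexp_replace`), but that part is an assumption and is not exercised here.

```python
def strip_nul(value):
    """Return value without NUL characters; non-strings pass through unchanged."""
    if isinstance(value, str):
        return value.replace("\x00", "")
    return value


def clean_row(row):
    """Apply strip_nul to every field of a dict-like row."""
    return {name: strip_nul(field) for name, field in row.items()}


# Made-up sample row: the NUL is invisible when printed but breaks the INSERT.
row = {"ExportName": "sensor\x00-a", "TYPE": "S", "ID": 7}
cleaned = clean_row(row)
print(repr(cleaned["ExportName"]))
```

Only string fields are touched, so numeric and NULL values reach Postgres unchanged.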

Related

Is X11 required for adding text overlay?

I'm trying to run the command below in a script, and it seems it's not adding any text overlay to the video. I'm sure I did this before and it was fine.
My question is, do I need to set up an X11 environment in order to use the dynamictext or text filters?
Thanks in advance.
/usr/bin/melt \
"/var/www/html/test/nkLcBPkebo/t-1.mp4" \
-audio-track \
"/var/www/html/test/nkLcBPkebo/t-1_sound.mp4" \
-attach-track \
"text:This is my best video" \
-0 \
in=0 out=0 fgcolour="#004fed" bgcolour=0 olcolour="#fff200" outline=3 pad="50x0" size=80 weight=700 style="italic" halign="center" valign="top" family="Ubuntu" \
-profile hdv_720_25p -progress \
-consumer avformat:"/var/www/html/test/nkLcBPkebo/1.mp4" \
vcodec="libx264" vb="5000k" acodec="aac" ab="128k" frequency=44100 deinterlace=1
I think I found the issue! It was all about the order of the parameters :)
Having the text right after the video (before the audio) should fix it:
/usr/bin/melt \
"/var/www/html/test/nkLcBPkebo/t-1.mp4" \
-attach-track \
"text:This is my best video" \
-0 \
in=0 out=0 fgcolour="#004fed" bgcolour=0 olcolour="#fff200" outline=3 pad="50x0" size=80 weight=700 style="italic" halign="center" valign="top" family="Ubuntu" \
-audio-track \
"/var/www/html/test/nkLcBPkebo/t-1_sound.mp4" \
-profile hdv_720_25p -progress \
-consumer avformat:"/var/www/html/test/nkLcBPkebo/1.mp4" \
vcodec="libx264" vb="5000k" acodec="aac" ab="128k" frequency=44100 deinterlace=1
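If the command is generated by a script, one way to keep the ordering from regressing is to assemble the argument list in a single place. The sketch below is only an illustration: `build_melt_args` is a hypothetical helper, the file names are placeholders, and only a few of the text properties from the command above are repeated. It builds and prints the command; it does not run melt.

```python
import shlex


def build_melt_args(video, audio, text, text_props, output):
    """Assemble melt arguments with the text track attached right after the video."""
    return [
        "/usr/bin/melt",
        video,
        # Text track first, directly after the video input...
        "-attach-track", "text:" + text, *text_props,
        # ...and the audio track only after the text.
        "-audio-track", audio,
        "-profile", "hdv_720_25p", "-progress",
        "-consumer", "avformat:" + output,
        "vcodec=libx264", "vb=5000k", "acodec=aac", "ab=128k",
        "frequency=44100", "deinterlace=1",
    ]


args = build_melt_args(
    video="t-1.mp4",
    audio="t-1_sound.mp4",
    text="This is my best video",
    text_props=["in=0", "out=0", "fgcolour=#004fed", "size=80"],
    output="1.mp4",
)
print(shlex.join(args))
```

Passing the list straight to `subprocess.run(args)` would avoid shell quoting issues with values such as `#004fed`.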

How to speed up writing from Spark dataframe into SQL Server using Pyspark?

It's taking about 15 minutes to insert a 500MB ndjson file with 100,000 rows into MS SQL Server table. I am running Spark locally on a machine with good specs - 32GB RAM, i9-10885H CPU with 8 cores. I doubt that the machine is being used to its full capabilities. Here is what I am trying.
from datetime import datetime
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.functions import col

# appName, files, upload_dir, url and table are assumed to be defined elsewhere (not shown)
master = "local[16]"
conf = SparkConf() \
    .setAppName(appName) \
    .set("spark.driver.memory", "16g") \
    .set("spark.executor.memory", "1g") \
    .set('spark.executor.cores', '5') \
    .set("spark.driver.extraClassPath","./mssql-jdbc-9.2.1.jre11.jar") \
    .setMaster(master)
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)
spark = sqlContext.sparkSession

def insert_into_ss(start=0):
    # Load each ndjson file from index `start` onwards and append it to the table
    for i in range(start, len(files)):
        item = files[i]
        print(item)
        started = datetime.now()
        spark_df = sqlContext.read.json(upload_dir + '/' + item)
        # Cast every column to string before writing
        spark_df = spark_df.select([col(c).cast("string") for c in spark_df.columns])
        print('Casting time', datetime.now() - started)
        spark_df.write.mode("append") \
            .format("jdbc") \
            .option("url", url) \
            .option("dbtable", table) \
            .option("batchsize", 20000) \
            .option("reliabilityLevel", 'NO_DUPLICATES') \
            .option("tableLock", 'true') \
            .option("numPartitions", 16) \
            .option("bulkCopyTimeout", 600000) \
            .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
            .save()
        end = datetime.now()
        print(end - started)

insert_into_ss()

Spark SQL JDBC empty dataframe

I'm trying to load the query result from one table into another. It connects fine and executes a query to get the metadata, but no data is returned.
from pyspark.sql import SQLContext, Row, SparkSession
spark = SparkSession.builder.config("spark.driver.extraClassPath", "C:\\spark\\SQL\\sqljdbc_7.0\\enu\\mssql-jdbc-7.0.0.jre10.jar").getOrCreate()
SQL = "Select [InvoiceID],[CustomerID],[BillToCustomerID],[OrderID],[DeliveryMethodID],[ContactPersonID],[AccountsPersonID],[SalespersonPersonID],[PackedByPersonID],[InvoiceDate],[CustomerPurchaseOrderNumber],[IsCreditNote],[CreditNoteReason],[Comments],[DeliveryInstructions],[InternalComments],[TotalDryItems],[TotalChillerItems],[DeliveryRun],[RunPosition],[ReturnedDeliveryData],[ConfirmedDeliveryTime],[ConfirmedReceivedBy],[LastEditedBy],[LastEditedWhen] FROM [Sales].[Invoices]"
pgDF = spark.read \
.format("jdbc") \
.option("url", "jdbc:sqlserver://Localhost") \
.option("query", SQL) \
.option("user", "dp_admin") \
.option("Database", "WideWorldImporters") \
.option("password", "password") \
.option("fetchsize", 1000) \
.load(SQL)
pgDF.write \
.format("jdbc") \
.option("url", "jdbc:sqlserver://Localhost") \
.option("dbtable", "wwi.Sales_InvoiceLines") \
.option("user", "dp_admin") \
.option("Database", "DW_Staging") \
.option("password", "password") \
.option("mode", "overwrite")
Looking at SQL Server Profiler, I see:
exec sp_executesql N'SELECT * FROM (Select [InvoiceID],[CustomerID],[BillToCustomerID],[OrderID],[DeliveryMethodID],[ContactPersonID],[AccountsPersonID],[SalespersonPersonID],[PackedByPersonID],[InvoiceDate],[CustomerPurchaseOrderNumber],[IsCreditNote],[CreditNoteReason],[Comments],[DeliveryInstructions],[InternalComments],[TotalDryItems],[TotalChillerItems],[DeliveryRun],[RunPosition],[ReturnedDeliveryData],[ConfirmedDeliveryTime],[ConfirmedReceivedBy],[LastEditedBy],[LastEditedWhen] FROM [Sales].[Invoices]) __SPARK_GEN_JDBC_SUBQUERY_NAME_0 WHERE 1=0'
The WHERE 1=0 gets added and no data is returned. Why is it added, and how can I remove it?
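As far as I understand, a `WHERE 1=0` wrapper like this is how Spark's JDBC source discovers column names and types without fetching any rows, so seeing it in the profiler does not by itself mean the read is broken. Spark is lazy and only pulls data once an action runs, and the write chain above never calls `.save()`. The sketch below uses Python's built-in `sqlite3` as a stand-in for SQL Server, purely so that it is self-contained; the `invoices` table and its rows are made up. It shows that the wrapped query returns the column metadata but no rows, while the original query still returns data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (InvoiceID INTEGER, CustomerID INTEGER)")
conn.executemany("INSERT INTO invoices VALUES (?, ?)", [(1, 10), (2, 20)])

user_query = "SELECT InvoiceID, CustomerID FROM invoices"

# Schema probe: wrap the query as a subquery with a predicate that is never true.
probe = conn.execute(
    "SELECT * FROM (%s) AS probe_subquery WHERE 1=0" % user_query
)
columns = [description[0] for description in probe.description]
probe_rows = probe.fetchall()

# The actual data fetch is a separate statement without the 1=0 filter.
data_rows = conn.execute(user_query).fetchall()

print(columns, probe_rows, data_rows)
```

So the thing to look for is a second query without the filter; if it never appears in the profiler, no action was triggered on the DataFrame.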

Cannot build Qt5 framework with the Yocto Project for qemuarm

I'm trying to run Qt5 framework through eglfs instead of X11 or wayland.
I'm trying to install Qt5 for qemuarm emulating a Raspberry Pi 3, based on Yocto Rocko.
This is my bblayers.conf:
# POKY_BBLAYERS_CONF_VERSION is increased each time build/conf/bblayers.conf
# changes incompatibly
POKY_BBLAYERS_CONF_VERSION = "2"
BBPATH = "${TOPDIR}"
SRCPATH = "/home/yocto/yocto"
BBFILES ?= ""
BBLAYERS ?= " \
${SRCPATH}/meta \
${SRCPATH}/meta-poky \
${SRCPATH}/meta-openembedded/meta-oe \
${SRCPATH}/meta-openembedded/meta-multimedia \
${SRCPATH}/meta-openembedded/meta-networking \
${SRCPATH}/meta-openembedded/meta-perl \
${SRCPATH}/meta-openembedded/meta-python \
${SRCPATH}/meta-qt5 \
${SRCPATH}/meta-raspberrypi \
${SRCPATH}/meta-security \
"
and this is my local.conf:
MACHINE ??= "qemuarm"
DL_DIR ?= "${TOPDIR}/../downloads"
DISTRO ?= "poky"
PACKAGE_CLASSES ?= "package_deb"
EXTRA_IMAGE_FEATURES ?= "debug-tweaks"
USER_CLASSES ?= "buildstats image-mklibs image-prelink"
PATCHRESOLVE = "noop"
BB_DISKMON_DIRS = "\
STOPTASKS,${TMPDIR},1G,100K \
STOPTASKS,${DL_DIR},1G,100K \
STOPTASKS,${SSTATE_DIR},1G,100K \
STOPTASKS,/tmp,100M,100K \
ABORT,${TMPDIR},100M,1K \
ABORT,${DL_DIR},100M,1K \
ABORT,${SSTATE_DIR},100M,1K \
ABORT,/tmp,10M,1K"
LICENSE_FLAGS_WHITELIST = "commercial"
CONF_VERSION = "1"
PREFERRED_VERSION_linux-raspberrypi = "4.%"
DISTRO_FEATURES_remove = "x11 wayland"
DISTRO_FEATURES_append = " systemd opengl pam ${DISTRO_FEATURES_LIBC}"
VIRTUAL-RUNTIME_init_manager = "systemd"
EXTRA_IMAGE_FEATURES += "package-management splash"
INHERIT+="toaster buildhistory"
CORE_IMAGE_EXTRA_INSTALL += "openssh"
ENABLE_UART="1"
#PACKAGECONFIG_append_qtbase = " accessibility eglfs fontconfig gles2 linuxfb"
################### QT ######################
QT_DEV_TOOLS = " \
qtbase-dev \
qtbase-mkspecs \
qtbase-plugins \
qtbase-tools \
qtserialport-dev \
qtserialport-mkspecs \
"
QT_TOOLS = " \
qtbase \
qtserialport \
"
FONTS = " \
fontconfig \
fontconfig-utils \
ttf-bitstream-vera \
"
TSLIB = " \
tslib \
tslib-conf \
tslib-calibrate \
tslib-tests \
tspress \
"
QT5_PKGS = " \
qt3d \
qtcharts \
qtdeclarative \
qtdeclarative-plugins \
qtdeclarative-qmlplugins \
qtgraphicaleffects \
qtlocation-plugins \
qtmultimedia \
qtquickcontrols2 \
qtsensors-plugins \
qtserialbus \
qtsvg \
qtwebsockets-qmlplugins \
qtvirtualkeyboard \
qtxmlpatterns \
"
QML_APPS = " \
qqtest \
"
CORE_IMAGE_EXTRA_INSTALL += " \
${FONTS} \
${QT_TOOLS} \
${QT5_PKGS} \
cinematicexperience \
"
I'm trying to build this image with bitbake rpi-hwup-image.
The problem is with qtbase; it fails with this error:
| ERROR: Feature 'opengl-desktop' was enabled, but the pre-condition '(config.win32 && !config.winrt && !features.opengles2 && (config.msvc || libs.opengl))
| || (!config.watchos && !config.win32 && libs.opengl)' failed.
|
| ERROR: The OpenGL functionality tests failed!
| You might need to modify the include and library search paths by editing QMAKE_INCDIR_OPENGL[_ES2],
| QMAKE_LIBDIR_OPENGL[_ES2] and QMAKE_LIBS_OPENGL[_ES2] in the mkspec for your platform.
Update
This problem was solved by uncommenting PACKAGECONFIG_append_qtbase. The variable name also had a typo, so it is now PACKAGECONFIG_append_pn-qtbase.
I added these lines too:
PACKAGECONFIG_append_pn-qemu-native = " sdl"
PACKAGECONFIG_append_pn-nativesdk-qemu = " sdl"
I also commented out the line LICENSE_FLAGS_WHITELIST = "commercial".
But the qtbase build fails again with the error below (this is the tail of the log file). I deleted the tmp folder and started bitbake rpi-hwup-image again, but it hit the same error:
| cd windowflags/ && ( test -e Makefile || /home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/recipe-sysroot-native/usr/bin/qt5/qmake -o Makefile /home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/git/examples/widgets/widgets/windowflags/windowflags.pro -qtconf /home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/build/bin/qt.conf ) && make -f Makefile
| make[4]: Entering directory '/home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/build/examples/widgets/widgets/windowflags'
| compiling /home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/git/examples/widgets/widgets/windowflags/controllerwindow.cpp
| compiling /home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/git/examples/widgets/widgets/windowflags/previewwindow.cpp
| linking wiggly
| make[4]: Leaving directory '/home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/build/examples/widgets/widgets/wiggly'
| compiling /home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/git/examples/widgets/widgets/windowflags/main.cpp
| linking validators
| generating .moc/moc_predefs.h
| moc /home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/git/examples/widgets/widgets/windowflags/controllerwindow.h
| make[4]: Leaving directory '/home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/build/examples/widgets/widgets/validators'
| moc /home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/git/examples/widgets/widgets/windowflags/previewwindow.h
| compiling .moc/moc_controllerwindow.cpp
| compiling .moc/moc_previewwindow.cpp
| linking windowflags
| make[4]: Leaving directory '/home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/build/examples/widgets/widgets/windowflags'
| make[3]: Leaving directory '/home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/build/examples/widgets/widgets'
| make[2]: Leaving directory '/home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/build/examples/widgets'
| make[1]: Leaving directory '/home/yocto/yocto/build/RP3_Qt/tmp/work/armv5e-poky-linux-gnueabi/qtbase/5.9.3+gitAUTOINC+4d8ae444c2-r0/build/examples'
| ERROR: oe_runmake failed
| WARNING: exit code 1 from a shell command.
I found the solution on this page.
The configuration section of that page mentions omitting the tests with the option -nomake tests, and that is the same part that fails at the compile stage of qtbase.
So, after cleaning qtbase, I added this line to my local.conf:
PACKAGECONFIG_remove_pn-qtbase = " tests gl"
gl is removed too because we want to build qtbase for eglfs, not for desktop OpenGL.
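To see why removing `gl` does not also knock out `gles2`, here is a deliberately simplified Python model of how the appended and removed values combine. It assumes that appends are joined onto the default first and that removal then drops whole whitespace-separated tokens. `resolve_packageconfig` is a hypothetical helper, and the default value is made up, since the real qtbase recipe default is not shown in this post.

```python
def resolve_packageconfig(default, appends=(), removes=()):
    """Simplified model: join appends onto the default, then drop removed tokens."""
    value = default + "".join(appends)
    dropped = set(" ".join(removes).split())
    # Removal matches whole tokens only, so "gl" does not match "gles2".
    return [token for token in value.split() if token not in dropped]


# Hypothetical default; the real recipe default is not shown in the post.
default = "tests gl dbus"
features = resolve_packageconfig(
    default,
    appends=[" accessibility eglfs fontconfig gles2 linuxfb"],
    removes=[" tests gl"],
)
print(features)
```

The leading space inside the appended string matters in this model for the same reason it does in local.conf: without it the first appended word would be glued onto the last default one.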

#define C usage, taking multiple values

In the past whenever I came across #define it was used like
#define MOD 1000000007
In the case above, all instances of MOD in the code were replaced by 1000000007.
I am new to open source development and was looking at several video filters of VLC media player. It has several uses of #define such as:
//example1
#define MSG_LONGTEXT N_( \
"Marquee text to display. " \
"(Available format strings: " \
"Time related: %Y = year, %m = month, %d = day, %H = hour, " \
"%M = minute, %S = second, ... " \
"Meta data related: $a = artist, $b = album, $c = copyright, " \
"$d = description, $e = encoded by, $g = genre, " \
"$l = language, $n = track num, $p = now playing, " \
"$r = rating, $s = subtitles language, $t = title, "\
"$u = url, $A = date, " \
"$B = audio bitrate (in kb/s), $C = chapter," \
"$D = duration, $F = full name with path, $I = title, "\
"$L = time left, " \
"$N = name, $O = audio language, $P = position (in %), $R = rate, " \
"$S = audio sample rate (in kHz), " \
"$T = time, $U = publisher, $V = volume, $_ = new line) ")
//example 2
#define POSY_TEXT N_("Y offset")
//example 3
#define TIMEOUT_LONGTEXT N_("Number of milliseconds the marquee must remain " \
"displayed. Default value is " \
"0 (remains forever).")
Can somebody explain these examples with respect to both #define and software development, or provide some resources?
It's exactly the same. The only addition is that a trailing \ continues the current line onto the next one; it's there for readability. The adjacent string literals are then concatenated by the compiler into a single string.
For example:
#define TIMEOUT_LONGTEXT N_("Number of milliseconds the marquee must remain " \
"displayed. Default value is " \
"0 (remains forever).")
is equivalent to
#define TIMEOUT_LONGTEXT N_("Number of milliseconds the marquee must remain " "displayed. Default value is " "0 (remains forever).")
So whenever TIMEOUT_LONGTEXT appears in the code, the preprocessor will replace it with N_("whatever").
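To make the two steps concrete (line splicing first, then textual replacement), here is a toy model written in Python. It is only a sketch under strong simplifications: it handles object-like macros only, and unlike a real C preprocessor it would also replace names inside string literals. The helper names `splice_lines`, `parse_define` and `expand` are hypothetical.

```python
import re


def splice_lines(source):
    """Remove each backslash-newline pair, joining continued lines."""
    return source.replace("\\\n", "")


def parse_define(line):
    """Split '#define NAME body' into (NAME, body); object-like macros only."""
    match = re.match(r"#define\s+(\w+)\s+(.*)", line)
    return match.group(1), match.group(2).strip()


def expand(code, macros):
    """Replace whole-word occurrences of each macro name with its body."""
    for name, body in macros.items():
        pattern = r"\b%s\b" % re.escape(name)
        # A function is used as the replacement so the body is inserted verbatim.
        code = re.sub(pattern, lambda _match, b=body: b, code)
    return code


# A multi-line definition in the same style as the VLC examples.
source = '#define POSY_TEXT N_( \\\n"Y " \\\n"offset")\n'
spliced = splice_lines(source)
name, body = parse_define(spliced.splitlines()[0])
print(expand("puts(POSY_TEXT);", {name: body}))
```

After splicing, the definition is a single line, which is why the multi-line form and the one-line form behave identically.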

Resources