I am using the 2.1 release of Festival. I was able to install and use the 172M voice with
(voice_cmu_us_slt_arctic_clunits)
The quality has improved significantly but is still far from what I want. I believe generation still uses a lot of defaults. Is it possible to tune this further (e.g. to get close to the quality of the qwiki.com engine)? I understand that I need a proper combination of
Synthesis method
Intonation/duration settings
Audio output parameters
xx ?
but it is very difficult to find all the details, and my progress is quite slow.
Any tips, links to tutorials or docs (even for an old version, as long as they give some theory overview), or Scheme snippets are appreciated.
PS
Please note that so far I am not interested in tuning the algorithms themselves (e.g. training the voice model with Sphinx).
To generate speech I use commands like
(SayText "This is a short introduction ...")
and
./text2wave -eval '(voice_cmu_us_slt_arctic_clunits)' TEXT > output.wav
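The `text2wave` call above can also be scripted, which makes it easier to compare settings side by side. The following is only a minimal sketch: the helper name `text2wave_command` is hypothetical, and it assumes the stock `text2wave` script, which accepts `-o` (output file) and `-scale` (volume factor) in addition to the `-eval` option already used above.

```python
import shlex


def text2wave_command(text_file, out_wav,
                      voice="voice_cmu_us_slt_arctic_clunits", scale=None):
    """Build the argument list for Festival's text2wave script.

    text_file : path of the plain-text input file
    out_wav   : path of the WAV file to write (passed via -o)
    voice     : name of the Scheme function that selects the voice
    scale     : optional volume scaling factor (passed via -scale)
    """
    cmd = ["./text2wave", "-eval", f"({voice})"]
    if scale is not None:
        cmd += ["-scale", str(scale)]
    cmd += ["-o", out_wav, text_file]
    return cmd


if __name__ == "__main__":
    cmd = text2wave_command("TEXT", "output.wav", scale=1.5)
    # Print the shell-quoted command instead of running it.
    print(shlex.join(cmd))
    # To actually run it (requires Festival to be installed):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell-quoting problems with the parentheses in the `-eval` argument.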
Related
I want to create and train my own model. If that is not possible, is there any way to train other models (YOLO, MobileNet, COCO)?
There are some requirements:
- I know only JS (I tried Python but could not continue; Python may be the best language for this, but I can't work in it, sorry)
- Performance: at least 24 FPS, i.e. real-time detection
- Freedom to use my own dataset (files like /dogs/pitbull/01.png)
I tried Python, but after 10 years of experience with JS I can't work in it.
Thanks to everyone for the help.
This answer is written by an IBMer.
If you want to build a model like the one described (image classification/object detection) without having to deal with Python, and you want to use it with JavaScript in a browser, you can try the tooling available at https://cloud.annotations.ai/. It works with an IBM Cloud account, but you can stay on the free tier, so you just need to register, at least for your first experiments.
You'll find code and boilerplate for using your model on different platforms at https://github.com/cloud-annotations.
It is not an advanced tool, but it lets you get hands-on with the topic.
What's the best approach to using existing English NLP tools with another language, e.g. Spanish?
That's an awfully broad question, and you'd need to provide some more pointers. However, if you're interested in general research on the topic, you can try Hana, Feldman, Brew (2004) "Tagging Russian using Czech morphology" and Resnik's 2004 "Using bilingual text for monolingual annotation" and start from there.
In general, you'd want to have a bicorpus, i.e. a parallel corpus (say, English/Swedish). Then establish mappings between the two sides using alignment (a common topic in machine translation with many established results).
You can then tag the English side and use the mappings to "translate" these tags onto the Swedish side. Then you can train the same tool that created the tags on the English side using the newly annotated Swedish corpus.
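The projection step can be sketched as follows. This is only an illustration under simplifying assumptions: the function name `project_tags`, the toy sentences, the tag names and the hand-written alignment are all hypothetical, and a real setup would take the alignment from a word aligner and the tags from an English tagger.

```python
from collections import Counter


def project_tags(src_tags, alignment, tgt_len, default="X"):
    """Copy tags from source tokens to the target tokens they align with.

    src_tags  : one tag per source (English) token
    alignment : (source_index, target_index) pairs from a word aligner
    tgt_len   : number of tokens in the target (Swedish) sentence
    default   : tag given to target tokens that nothing aligns to
    """
    votes = [Counter() for _ in range(tgt_len)]
    for src_i, tgt_i in alignment:
        votes[tgt_i][src_tags[src_i]] += 1
    # Each target token takes the most frequent tag among its aligned sources.
    return [v.most_common(1)[0][0] if v else default for v in votes]


english = ["the", "dog", "sleeps"]
english_tags = ["DET", "NOUN", "VERB"]
swedish = ["hunden", "sover"]
# "dog" -> "hunden", "sleeps" -> "sover"; "the" is left unaligned here.
alignment = [(1, 0), (2, 1)]

projected = project_tags(english_tags, alignment, len(swedish))
print(list(zip(swedish, projected)))
# prints [('hunden', 'NOUN'), ('sover', 'VERB')]
```

The projected Swedish tags then serve as (noisy) training data for the tagger.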
It goes without saying that you'll lose quite a bit of quality and that this technique only works for supervised methods. You should probably try to find properly annotated Swedish corpora and tools. There are a few out there.
Is there any tool, similar to codepad, for writing C code that lets me share my code with a group, so that group members can view and edit it simultaneously in real time?
I can't stress enough that this is going to make your work more difficult if you're planning to use it for anything other than something like a code review. That said, what you are describing is called a real-time collaborative editor, and there are a ton of them. I used one on Linux a while back whose name I can't remember, but in the meantime, let Wikipedia start you off...
http://en.wikipedia.org/wiki/Collaborative_real-time_editor
Edit:
The tool I used on Linux that worked well was called Gobby.
There are a bunch of others in this question on SO: "Real time tool for collaborative coding".
Sorry for resurrecting an old question but I thought I should share this.
I usually use Collab.Center (http://collab.center). Some features I like about it better than others are:
- Online, real-time collaborative coding
- Support for a lot of languages (40+, I think), e.g. C, C++, Java, HTML/CSS/JS, PHP, etc.
- Text and video (webcam) chat (requires sign-in)
- Syntax highlighting, auto-closing brackets, matching brackets, etc.
- Ability to manage all your documents (requires sign-in)
- Private documents (requires sign-in)
I think it would be great for you and your group, if you haven't already found an alternative.
We are in the process of defining our software development process and wanted to get some feedback from the group about this topic.
Our team is spread out - US, Canada and India - and I would like to put into place some simple standard rules that all teams will apply to their code.
We make use of ClearCase/ClearQuest and RAD.
I have been looking at PMD, CPP, Checkstyle and FindBugs as a start.
My thought is to just put these into Ant and have the developers run them manually. I realize that doing this requires some trust that each developer will actually do it.
The other thought is to add some builders to the IDE that would run a subset of the rules (to keep the build process light) and then run another, heavier set when developers check in their code.
Another idea is to use something like CruiseControl and set it up to run these static analysis tools along with the unit tests whenever ClearCase/ClearQuest is idle.
I am wondering whether others have done this, and whether it was successful or whether they can share lessons learned.
We have:
ClearCase used with Hudson for any "heavy" static analysis step
Eclipse IDE with the tools you mentioned integrated with a smaller set of rules
Note: we haven't really managed to make replicas work with our different user bases (US, Europe, Hong Kong), and we are using CCRC instead of MultiSite.
Since ClearCase is mainly used in Europe, the analysis step takes place during the night there (UTC) and uses snapshot views to make sure it goes as quickly as possible (a dynamic view involves too much network traffic when accessing large files).
I'd use Hudson to run static analysis on SCM changes if your code base is not too large, or on periodic builds if it is.
OK, I can't resist... If your team is spread out, why in the world would you use ClearCase? As someone who had to use it: when our company switched to Mercurial, the team velocity improved immensely. That multi-site junk is just awful.
I'm communicating with a logic analyzer (HP 1660A) over RS232. I issue a command which tells the analyzer to take a print screen of its display and send it to the controller (my PC) over the serial link. I'm saving the result (usually about 25 kB) to my computer and I would like to view it as a TIFF or another image format. The problem is that the response from the analyzer comes in PCL format, which is suitable for sending straight to a printer but not for opening as an image. I have tried a few PCL-to-image converters. I found one which does the job properly, but I've only used the trial version and I am reluctant to purchase it. That is the background of my work so far. I would appreciate any kind of help: a reference to the commands in PCL 1, and what I should do in order to extract the data from the PCL file and format it properly. I have no experience with PCL or image processing whatsoever, so please give me a hand here. Thank you.
P.S. I've obtained the PCL file from the analyzer in both C# and MATLAB. I have one slight problem in C# with the serial port control: some images contain uninterpreted characters when run through the above converters. I mention all this because I need an algorithm or some pointers, no matter the programming language, so please feel free to post.
PCL is complex to read. There are only a handful of tools out there that do a good job of this. We have lots of PCL expertise and still often look to others to supply conversion to PDF and other formats. If the PCL is quite simple, that is, just text, a few fonts, and a graphic or two, a couple of RegEx commands could deal with the extraction of the text, and then you could mock up a new document using whatever tools you wish.
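For the simple case where the stream is essentially one raster graphic, the extraction can be sketched as below. This is a hedged illustration, not a full PCL interpreter: it assumes the capture was stored as raw bytes (not decoded as text), that raster rows arrive via `ESC * b # W` with compression mode 0 (unencoded), and that any other sequence ending in `W` carries a binary payload to be skipped. Whether the HP 1660A actually uses mode 0 is not known here. The function names `extract_raster_rows` and `rows_to_pbm` are hypothetical.

```python
ESC = 0x1B


def _to_int(text):
    """Parse a PCL value field such as '75' or '+3'; empty or bad input gives 0."""
    try:
        return int(float(text))
    except ValueError:
        return 0


def extract_raster_rows(data: bytes):
    """Collect the payloads of ESC*b#W (transfer raster data) sequences."""
    rows, mode = [], 0
    i, n = 0, len(data)
    while i < n:
        if data[i] != ESC:
            i += 1                      # plain text or control code: ignore
            continue
        i += 1
        if i >= n:
            break
        if not 33 <= data[i] <= 47:     # two-character sequence such as ESC E
            i += 1
            continue
        prefix = chr(data[i])
        i += 1
        group = ""
        if i < n and 96 <= data[i] <= 126:   # optional group character
            group = chr(data[i])
            i += 1
        while i < n:                    # one or more "value + parameter" pairs
            start = i
            while i < n and chr(data[i]) in "+-.0123456789":
                i += 1
            value = _to_int(data[start:i].decode("ascii"))
            if i >= n:
                break
            param = data[i]
            i += 1
            key = prefix + group + chr(param).upper()
            if key == "*bM":            # set raster compression mode
                mode = value
            elif key.endswith("W"):     # sequence followed by `value` data bytes
                if key == "*bW":
                    if mode != 0:
                        raise NotImplementedError(f"compression mode {mode}")
                    rows.append(data[i:i + value])
                i += value
            if not 96 <= param <= 126:  # anything but a lowercase-range char ends it
                break
    return rows


def rows_to_pbm(rows):
    """Pack rows into a binary PBM (P4) image: 1 bit per pixel, 1 = black."""
    width_bytes = max((len(r) for r in rows), default=0)
    header = f"P4\n{width_bytes * 8} {len(rows)}\n".encode("ascii")
    return header + b"".join(r.ljust(width_bytes, b"\x00") for r in rows)


# Tiny hand-made stream: reset, 75 dpi, start raster, mode 0, two rows, end raster.
sample = (b"\x1bE\x1b*t75R\x1b*r1A\x1b*b0M"
          b"\x1b*b2W\xff\x00\x1b*b2W\x0f\xf0\x1b*rB")
rows = extract_raster_rows(sample)
pbm = rows_to_pbm(rows)
```

The resulting PBM bytes can be written to a file and opened or converted to TIFF with common image tools. Compressed raster modes (1, 2, 3) would need their own decoders.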
Looking at these files on Stack Overflow might be tough. If you can put them on an FTP server and post a link, I can take a quick look and post my findings/thoughts here. The other option is to look to an outside tool. There are a few we've had success with. Our needs are broad, so I've settled on the one that works best with many different PCL streams (some PCL coding is better than others). As you are dealing with a known quantity of PCL, you may have a few options. Here are a few we've used with some success (in order of usefulness to us):
PCLWorks by PageTech (they have a GUI viewer and complete SDK)
VeryPDF PCL Converter (command line tool)
SwiftView
There are others, and even an open-source variant of Ghostscript that handles PCL (we've never had much luck with it, as the PCL we use often contains very custom fonts, symbol sets, and tons of macros, which seem to choke it):
GhostPCL
EDIT: Most recently we've been working with LincPDF (http://www.lincolnco.com/). This is also an excellent product that has one big benefit: deployment is simple. Some of the other tools have complex software installations. This solution is very easy for us to deploy as a feature in an application. It's also faster than any tool we've tested to date (at least with the PCL that we generate from our apps, which is quite complex as it includes specialized fonts and macros).
According to the spec sheet for the HP 1660 series (pdf), it can output TIFF, PCX and PostScript.
Wouldn't it be easier to use TIFF?
The project was put on hold for a while, but I would like to offer a complete and usable solution.
@Adrian
You can save the image to a floppy disk. I've done that, saved it as TIFF, and everything worked fine. Unfortunately, the analyzer sends only PCL through RS232. The idea of saving the print screen over the serial link was to avoid overusing the floppy disk, which the device also uses to boot.
@Douglas
Thank you for your elaborate answer. I'll take a look at the tools you mentioned. However, my aim is to offer a complete front-end solution which directly yields the graphic. I've put some files from my tests here so that you can see the complexity of the PCL constructions. Do you know of any API that I could integrate into my application to parse the file and interpret the PCL?
Regards,
Cosmin
We capture the serial input via a serial spooler that watches COM1:. It's called SSpool.exe. It redirects the PCL as input to PCLXForm. PCLXForm converts it into any raster format (TIFF, JPG, PDF, BMP, etc.). We can also extract the text during the conversion, and we can extract individual raster objects from the PCL for re-arrangement in the downstream application. Our pricing model is positioned for licensees that need to convert up to 50,000 pages of invoices into indexed PDFs per month. However, this type of application normally requires a custom license in order to get our pricing down to the level required. To do so, we often have to restrict our product so that it converts an unlimited number of files, but only up to the 20th page within any one PCL print file. That provides enough page volume and gives us the ability to reduce the pricing per unit. To demo, you would need the PCLTool SDK.