I'm stuck in my own wait loop and not really sure why. The function takes an input and an output channel, then for each item on the input channel performs an HTTP GET for the content and pulls the title tag from the HTML.
The GET and scrape happen inside a goroutine, and I've set up a WaitGroup (innerWait) to be sure that everything has been processed before closing the output channel.
func (fp FeedProducer) getTitles(in <-chan feeds.Item,
	out chan<- feeds.Item,
	wg *sync.WaitGroup) {
	defer wg.Done()
	var innerWait sync.WaitGroup
	for item := range in {
		log.Infof(fp.c, "Incrementing inner WaitGroup.")
		innerWait.Add(1)
		go func(item feeds.Item) {
			defer innerWait.Done()
			defer log.Infof(fp.c, "Decriment inner wait group by defer.")
			client := urlfetch.Client(fp.c)
			resp, err := client.Get(item.Link.Href)
			log.Infof(fp.c, "Getting title for: %v", item.Link.Href)
			if err != nil {
				log.Errorf(fp.c, "Error retriving page. %v", err.Error())
				return
			}
			if strings.ToLower(resp.Header.Get("Content-Type")) == "text/html; charset=utf-8" {
				title := fp.scrapeTitle(resp)
				item.Title = title
			} else {
				log.Errorf(fp.c, "Wrong content type. Received: %v from %v", resp.Header.Get("Content-Type"), item.Link.Href)
			}
			out <- item
		}(item)
	}
	log.Infof(fp.c, "Waiting for title pull wait group.")
	innerWait.Wait()
	log.Infof(fp.c, "Done waiting for title pull.")
	close(out)
}

func (fp FeedProducer) scrapeTitle(request *http.Response) string {
	defer request.Body.Close()
	tokenizer := html.NewTokenizer(request.Body)
	var titleIsNext bool
	for {
		token := tokenizer.Next()
		switch {
		case token == html.ErrorToken:
			log.Infof(fp.c, "Hit the end of the doc without finding title.")
			return ""
		case token == html.StartTagToken:
			tag := tokenizer.Token()
			isTitle := tag.Data == "title"
			if isTitle {
				titleIsNext = true
			}
		case titleIsNext && token == html.TextToken:
			title := tokenizer.Token().Data
			log.Infof(fp.c, "Pulled title: %v", title)
			return title
		}
	}
}
Log content looks like this:
2015/08/09 22:02:10 INFO: Revived query parameter: golang
2015/08/09 22:02:10 INFO: Getting active tweets from the last 7 days.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Waiting for title pull wait group.
2015/08/09 22:02:10 INFO: Getting title for: http://devsisters.github.io/goquic/
2015/08/09 22:02:10 INFO: Pulled title: GoQuic by devsisters
2015/08/09 22:02:10 INFO: Getting title for: http://whizdumb.me/2015/03/03/matching-a-string-and-extracting-values-using-regex/
2015/08/09 22:02:10 INFO: Pulled title: Matching a string and extracting values using regex | Whizdumb's blog
2015/08/09 22:02:10 INFO: Getting title for: https://www.reddit.com/r/golang/comments/3g7tyv/dropboxs_infrastructure_is_go_at_a_huge_scale/
2015/08/09 22:02:10 INFO: Pulled title: Dropbox's infrastructure is Go at a huge scale : golang
2015/08/09 22:02:10 INFO: Getting title for: http://dave.cheney.net/2015/08/08/performance-without-the-event-loop
2015/08/09 22:02:10 INFO: Pulled title: Performance without the event loop | Dave Cheney
2015/08/09 22:02:11 INFO: Getting title for: https://github.com/ccirello/sublime-gosnippets
2015/08/09 22:02:11 INFO: Pulled title: ccirello/sublime-gosnippets · GitHub
2015/08/09 22:02:11 INFO: Getting title for: https://medium.com/iron-io-blog/an-easier-way-to-create-tiny-golang-docker-images-7ba2893b160?mkt_tok=3RkMMJWWfF9wsRonuqTMZKXonjHpfsX57ewoWaexlMI/0ER3fOvrPUfGjI4ATsNrI%2BSLDwEYGJlv6SgFQ7LMMaZq1rgMXBk%3D&utm_content=buffer45a1c&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer
2015/08/09 22:02:11 INFO: Pulled title: An Easier Way to Create Tiny Golang Docker Images — Iron.io Blog — Medium
I can see that I'm getting to the innerWait.Wait() command based on the logs, which also tells me that the inbound channel has been closed on the other side of the pipe.
It would appear that the defer statements in the anonymous function are not being called, as I can't see the deferred log statement printed anywhere. But I can't for the life of me tell why as all code in that block appears to execute.
Help is appreciated.
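The goroutines are stuck sending to out at this line:
out <- item
If nothing is receiving from out (and the channel is unbuffered or full), the send blocks forever, so the deferred innerWait.Done() never runs and innerWait.Wait() never returns.
The fix is to start a goroutine that receives from out.
A good way to debug issues like this is to dump the goroutine stacks by sending the process a SIGQUIT.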
I am learning Go and MongoDB, currently using the alpha version of the official MongoDB driver. Although it is in alpha, I think it is quite functional for basic usage.
But I ran into an interesting issue with time conversion in this driver.
Basically, I created a struct value of a custom type, marshaled it to a BSON document, and then converted the BSON document back to a struct.
//check github.com/mongodb/mongo-go-driver/blob/master/bson/marshal_test.go
func TestUserStructToBsonAndBackwards(t *testing.T) {
	u := user{
		Username: "test_bson_username",
		Password: "1234",
		UserAccessibility: "normal",
		RegisterationTime: time.Now(), //.Format(time.RFC3339), adding format result a string
	}
	//Struct To Bson
	bsonByteArray, err := bson.Marshal(u)
	if err != nil {
		t.Error(err)
	}
	//.UnmarshalDocument is the same as ReadDocument
	bDoc, err := bson.UnmarshalDocument(bsonByteArray)
	if err != nil {
		t.Error(err)
	}
	unameFromBson, err := bDoc.LookupErr("username")
	//so here the binding is working for bson object too, the bind field named username ratherthan Username
	if err != nil {
		t.Error(err)
	}
	if unameFromBson.StringValue() != "test_bson_username" {
		t.Error("bson from user struct Error")
	}
	//Bson Doc to User struct
	bsonByteArrayFromDoc, err := bDoc.MarshalBSON()
	if err != nil {
		t.Error(err)
	}
	var newU user
	err = bson.Unmarshal(bsonByteArrayFromDoc, &newU)
	if err != nil {
		t.Error(err)
	}
	if newU.Username != u.Username {
		t.Error("bson Doc to user struct Error")
	}
	//here we have an issue about time format.
	if newU != u {
		log.Println(newU)
		log.Println(u)
		t.Error("bson Doc to user struct time Error")
	}
}
However, since my struct has a time field, the resulting struct contains a less precise time value than the original, so the comparison fails.
=== RUN TestUserStructToBsonAndBackwards
{test_bson_username 1234 0001-01-01 00:00:00 +0000 UTC 2018-08-28 23:56:50.006 +0800 CST 0001-01-01 00:00:00 +0000 UTC normal }
{test_bson_username 1234 0001-01-01 00:00:00 +0000 UTC 2018-08-28 23:56:50.006395949 +0800 CST m=+0.111119920 0001-01-01 00:00:00 +0000 UTC normal }
--- FAIL: TestUserStructToBsonAndBackwards (0.00s)
model.user_test.go:67: bson Doc to user struct time Error
So I have a few questions:
How do I compare times properly in this case?
What's the best way to store times in the database to avoid such precision issues? I think the time in the database should not be a string.
Is this a driver bug?
Times in BSON are represented as UTC milliseconds since the Unix epoch (spec). Time values in Go have nanosecond precision.
To round trip time.Time values through BSON marshalling, use times truncated to milliseconds since the Unix epoch:
func truncate(t time.Time) time.Time {
	return time.Unix(0, t.UnixNano()/1e6*1e6)
}
...
u := user{
	Username: "test_bson_username",
	Password: "1234",
	UserAccessibility: "normal",
	RegisterationTime: truncate(time.Now()),
}
You can also use the Time.Truncate method:
u := user{
	Username: "test_bson_username",
	Password: "1234",
	UserAccessibility: "normal",
	RegisterationTime: time.Now().Truncate(time.Millisecond),
}
This approach relies on the fact that Unix epoch and Go zero time differ by a whole number of milliseconds.
You've correctly identified that the issue is one of precision.
MongoDB's Date type is "a 64-bit integer that represents the number of milliseconds...".
Golang's time.Time type "represents an instant in time with nanosecond precision".
As such, if you compare these respective values as golang types you will only get equivalence if the golang Time has millisecond resolution (e.g. zeroes for micro- and nanosecond places).
For example:
gotime := time.Now() // Nanosecond precision
jstime := gotime.Truncate(time.Millisecond) // Milliseconds
gotime == jstime // => likely false (different precision)
isoMillis := "2006-01-02T15:04:05.000-0700Z"
gomillis := gotime.Format(isoMillis)
jsmillis := jstime.Format(isoMillis)
gomillis == jsmillis // => true (same precision)
I'm having issues setting the timeout in go-mssqldb
This is my current connection string:
sqlserver://user:password@server?timeout=1m30s
I can connect just fine, run queries etc. but I keep timing out at the default value of 30 seconds.
I'm referencing the documentation here.
What am I missing?
import (
	"database/sql"
	_ "github.com/denisenkom/go-mssqldb"
)

func main() {
	db, err := sql.Open("mssql", "sqlserver://user:password@server?timeout=1m30s")
	if err != nil {
		panic(err)
	}
	_, err = db.Exec("run query that takes longer than 30 seconds")
	if err != nil {
		panic(err)
	}
	// panic at 30 seconds...
	// panic: read tcp {my ip}->{server ip}: i/o timeout
}
I was referencing the wrong documentation initially. To format the URL, see the following:
"sqlserver://user:password@server?connection+timeout=90"
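One way to avoid hand-encoding the parameter name is to build the URL with the standard net/url package, which encodes the space in "connection timeout" as "+". The sketch below uses only the standard library; buildDSN is a made-up helper, the user, password and host values are placeholders, and the assumption that the value is a number of seconds comes from the driver's README as I understand it.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildDSN assembles a sqlserver:// connection URL (hypothetical helper).
// timeoutSeconds is assumed to be interpreted by the driver as seconds.
func buildDSN(user, password, host string, timeoutSeconds int) string {
	q := url.Values{}
	q.Set("connection timeout", fmt.Sprint(timeoutSeconds))

	u := url.URL{
		Scheme:   "sqlserver",
		User:     url.UserPassword(user, password),
		Host:     host,
		RawQuery: q.Encode(), // encodes the space as "+"
	}
	return u.String()
}

func main() {
	fmt.Println(buildDSN("user", "password", "server", 90))
	// prints sqlserver://user:password@server?connection+timeout=90
}
```

The resulting string can then be passed to sql.Open in place of the hand-written URL.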
I followed the tutorial at https://medium.com/@chvanikoff/phoenix-react-love-story-reph-1-c68512cfe18 and developed an application, but with different versions:
elixir - 1.3.4
phoenix - 1.2.1
poison - 2.0
distillery - 0.10
std_json_io - 0.1
The application runs successfully locally.
But when I created a mix release (MIX_ENV=prod mix release --env=prod --verbose) and ran rel/utopia/bin/utopia console (the OTP application name is :utopia), I ran into this error:
Interactive Elixir (1.3.4) - press Ctrl+C to exit (type h() ENTER for help)
14:18:21.857 request_id=idqhtoim2nb3lfeguq22627a92jqoal6 [info] GET /
panic: write |1: broken pipe
goroutine 3 [running]:
runtime.panic(0x4a49e0, 0xc21001f480)
/usr/local/Cellar/go/1.2.2/libexec/src/pkg/runtime/panic.c:266 +0xb6
log.(*Logger).Panicf(0xc210020190, 0x4de060, 0x3, 0x7f0924c84e38, 0x1, ...)
/usr/local/Cellar/go/1.2.2/libexec/src/pkg/log/log.go:200 +0xbd
main.fatal_if(0x4c2680, 0xc210039ab0)
/Users/alco/extra/goworkspace/src/goon/util.go:38 +0x17e
main.inLoop2(0x7f0924e0c388, 0xc2100396f0, 0xc2100213c0, 0x7f0924e0c310, 0xc210000000, ...)
/Users/alco/extra/goworkspace/src/goon/io.go:100 +0x5ce
created by main.wrapStdin2
/Users/alco/extra/goworkspace/src/goon/io.go:25 +0x15a
goroutine 1 [chan receive]:
main.proto_2_0(0x7ffce6670101, 0x4e3e20, 0x3, 0x4de5a0, 0x1, ...)
/Users/alco/extra/goworkspace/src/goon/proto_2_0.go:58 +0x3a3
main.main()
/Users/alco/extra/goworkspace/src/goon/main.go:51 +0x3b6
14:18:21.858 request_id=idqhtoim2nb3lfeguq22627a92jqoal6 [info] Sent 500 in 1ms
14:18:21.859 [error] #PID<0.1493.0> running Utopia.Endpoint terminated
Server: 127.0.0.1:8080 (http)
Request: GET /
** (exit) an exception was raised:
** (Protocol.UndefinedError) protocol String.Chars not implemented for {#PID<0.1467.0>, :result, %Porcelain.Result{err: nil, out: {:send, #PID<0.1466.0>}, status: 2}}
(elixir) lib/string/chars.ex:3: String.Chars.impl_for!/1
(elixir) lib/string/chars.ex:17: String.Chars.to_string/1
(utopia) lib/utopia/react_io.ex:2: Utopia.ReactIO.json_call!/2
(utopia) web/controllers/page_controller.ex:12: Utopia.PageController.index/2
(utopia) web/controllers/page_controller.ex:1: Utopia.PageController.action/2
(utopia) web/controllers/page_controller.ex:1: Utopia.PageController.phoenix_controller_pipeline/2
(utopia) lib/utopia/endpoint.ex:1: Utopia.Endpoint.instrument/4
(utopia) lib/phoenix/router.ex:261: Utopia.Router.dispatch/2
goon panicked, and Porcelain failed as a result. Could someone please suggest a solution?
Related issues: https://github.com/alco/porcelain/issues/13
EDIT: My page_controller.ex
defmodule Utopia.PageController do
  use Utopia.Web, :controller
  def index(conn, _params) do
    visitors = Utopia.Tracking.Visitors.state
    initial_state = %{"visitors" => visitors}
    props = %{
      "location" => conn.request_path,
      "initial_state" => initial_state
    }
    result = Utopia.ReactIO.json_call!(%{
      component: "./priv/static/server/js/utopia.js",
      props: props
    })
    render(conn, "index.html", html: result["html"], props: initial_state)
  end
end
I perform an app engine query to get a cursor (wrec), and the code shows the number of records correctly by iterating. But then "for rec in wrec" does not run (no logging.info inside this loop).
There's also a GQL SELECT of the same table, with another cursor (wikiCursor) that jinja2 renders properly. Here's the part that doesn't work:
wrec = Wiki.all().ancestor(wiki_key()).filter('pagename >=', findPage).filter('pagename <', findPage + u'\ufffd').run()
foundRecs = sum(1 for _ in wrec)
logging.info("Class WikiPage: foundRecs is %s", foundRecs)
aFoundRecs = []
if foundRecs > 0:
    for rec in wrec:
        logging.info("Class WikiPage: value is %s", wrec.pagename)
        aFoundRecs.append(rec.pagename)
    self.render("permalink.html", userRec=self.userRec, wikipage=pagename,
                wikiCursor=wikiCursor, wrec=wrec, foundRecs=foundRecs)
else:
    errorText = "Could not find anything for your entry."
    self.render("permalink.html", userRec=self.userRec, wikipage=pagename, wikiCursor=wikiCursor, error=errorText)
Here is a portion of the log, showing the first logging.info statement, but not the second:
INFO 2014-09-15 19:35:29,525 main.py:410] Class WikiPage: foundRecs is 3
INFO 2014-09-15 19:35:29,581 module.py:652] default: "POST / HTTP/1.1" 200 3058
Why is the wrec for loop not running?
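When you use sum to count, the query is already iterated to the end. It is expected that iterating over it a second time yields nothing, because the iterator is already exhausted. Fetch the results into a list first (for example, recs = list(wrec)), then use len(recs) for the count and loop over recs.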
App Engine does not allow use of DefaultClient, providing the urlfetch service instead. The following minimal example deploys and works pretty much as expected:
package app

import (
	"fmt"
	"net/http"
	"appengine"
	"appengine/urlfetch"
	"code.google.com/p/goauth2/oauth"
)

func init() {
	http.HandleFunc("/", home)
}

func home(w http.ResponseWriter, r *http.Request) {
	c := appengine.NewContext(r)
	config := &oauth.Config{
		ClientId: "<redacted>",
		ClientSecret: "<redacted>",
		Scope: "email",
		AuthURL: "https://www.facebook.com/dialog/oauth",
		TokenURL: "https://graph.facebook.com/oauth/access_token",
		RedirectURL: "http://example.com/",
	}
	code := r.FormValue("code")
	if code == "" {
		http.Redirect(w, r, config.AuthCodeURL("foo"), http.StatusFound)
	}
	t := &oauth.Transport{Config: config, Transport: &urlfetch.Transport{Context: c}}
	tok, _ := t.Exchange(code)
	graphResponse, _ := t.Client().Get("https://graph.facebook.com/me")
	fmt.Fprintf(w, "<pre>%s<br />%s</pre>", tok, graphResponse)
}
With correct ClientId, ClientSecret and RedirectURL, this produces the following output (edited for brevity):
&{AAADTWGsQ5<snip>kMdjh5VKwZDZD 0001-01-01 00:00:00 +0000 UTC}
&{200 OK %!s(int=200) HTTP/1.1 %!s(int=1) %!s(int=1)
map[Connection:[keep-alive] Access-Control-Allow-Origin:[*]
<snip>
Content-Type:[text/javascript; charset=UTF-8]
Date:[Wed, 06 Feb 2013 12:06:45 GMT] X-Google-Cache-Control:[remote-fetch]
Cache-Control:[private, no-cache, no-store, must-revalidate] Pragma:[no-cache]
X-Fb-Rev:[729873] Via:[HTTP/1.1 GWA] Expires:[Sat, 01 Jan 2000 00:00:00 GMT]]
%!s(*urlfetch.bodyReader=&{[123 34 105 100 <big snip> 48 48 34 125] false false})
%!s(int64=306) [] %!s(bool=true) map[] %!s(*http.Request=&{GET 0xf840087230
HTTP/1.1 1 1 map[Authorization:[Bearer AAADTWGsQ5NsBAC4yT0x1shZAJAtODOIx0tZCb
TYTjxFC4esEqCjPDi3REMKHBUjZCX4FIKLO1UjMpJxhJZCfGFcOJlFu7UvehkMdjh5VKwZDZD]]
0 [] false graph.facebook.com map[] map[] })}
It certainly seems like I'm consistently getting an *http.Response back, so I would expect to be able to read from the response Body. However, any mention of Body--for example with:
defer graphResponse.Body.Close()
compiles, deploys, but results in the following runtime error:
panic: runtime error: invalid memory address or nil pointer dereference
runtime.panic go/src/pkg/runtime/proc.c:1442
runtime.panicstring go/src/pkg/runtime/runtime.c:128
runtime.sigpanic go/src/pkg/runtime/thread_linux.c:199
app.home app/app.go:33
net/http.HandlerFunc.ServeHTTP go/src/pkg/net/http/server.go:704
net/http.(*ServeMux).ServeHTTP go/src/pkg/net/http/server.go:942
appengine_internal.executeRequestSafely go/src/pkg/appengine_internal/api_prod.go:240
appengine_internal.(*server).HandleRequest go/src/pkg/appengine_internal/api_prod.go:190
reflect.Value.call go/src/pkg/reflect/value.go:526
reflect.Value.Call go/src/pkg/reflect/value.go:334
_ _.go:316
runtime.goexit go/src/pkg/runtime/proc.c:270
What am I missing? Is this because of the use of urlfetch rather than DefaultClient?
Okay, this was of course my own silly fault but I can see how others could fall into the same trap so here's the solution, prompted by Andrew Gerrand and Kyle Lemons in this google-appengine-go topic (thanks guys).
First of all, I wasn't handling requests to favicon.ico. That can be taken care of by following the instructions here and adding a section to app.yaml:
- url: /favicon\.ico
  static_files: images/favicon.ico
  upload: images/favicon\.ico
This fixed panics on favicon requests, but not panics on requests to '/'. Problem was, I'd assumed that an http.Redirect ends handler execution at that point. It doesn't. What was needed was either a return statement following the redirect, or an else clause:
code := r.FormValue("code")
if code == "" {
	http.Redirect(w, r, config.AuthCodeURL("foo"), http.StatusFound)
} else {
	t := &oauth.Transport{Config: config, Transport: &urlfetch.Transport{Context: c}}
	tok, _ := t.Exchange(code)
	fmt.Fprintf(w, "%s", tok.AccessToken)
	// ...
}
I don't recommend ignoring the error of course but this deploys and runs as expected, producing a valid token.