Can I send an alert when a message is published to a pubsub topic? - google-cloud-pubsub

We are using pubsub & a cloud function to process a stream of incoming data. I am setting up a dead letter topic to handle cases where a message cannot be processed, as described at Cloud Pub/Sub > Guides > Handling message failures.
I've configured a subscription on the dead-letter topic to retain messages for 7 days; we're doing this using Terraform:
resource "google_pubsub_subscription" "dead_letter_monitoring" {
project = var.project_id
name = "var.dead_letter_sub_name
topic = google_pubsub_topic.dead_letter.name
expiration_policy { ttl = "" }
message_retention_duration = "604800s" # 7 days
retain_acked_messages = true
ack_deadline_seconds = 600
}
We've tested our cloud function thoroughly, so we expect messages to appear on this dead-letter topic very rarely, perhaps never. Nevertheless, we're putting it in place just to make sure that we catch any anomalies.
Given how rarely we expect messages to appear on the dead-letter topic, we need to set up an alert that sends an email when such a message appears. Is it possible to do this? I've taken a look through the alerts one can create at https://console.cloud.google.com/monitoring/alerting/policies/create, however I didn't see anything that could accomplish this.
I know that I could write a cloud function to consume a message from the subscription and act upon it accordingly, however I'd rather not have to do that; a monitoring alert feels like a much more elegant way of achieving this.
Is this possible?

Yes, you can use Cloud Monitoring for that. Create a new alerting policy and configure it as follows:
Select the Pub/Sub Topic resource type and the Published messages metric. Observe the value every minute and count it (the aligner, under the advanced options). Then configure the condition so the alert fires when the most recent value is above 0.
To restrict the alert to your topic, add a filter on topic_id with your topic's name.
Then, configure your alert to send an email. It should work!
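Since you're already using Terraform, the same policy can be expressed in HCL. This is a minimal sketch, under the assumption that the console's "Published messages" metric corresponds to pubsub.googleapis.com/topic/send_message_operation_count; the email address is a placeholder:

resource "google_monitoring_notification_channel" "email" {
  project      = var.project_id
  display_name = "Dead-letter alerts"
  type         = "email"
  labels = {
    email_address = "alerts@example.com" # placeholder
  }
}

resource "google_monitoring_alert_policy" "dead_letter_message" {
  project      = var.project_id
  display_name = "Message published to dead-letter topic"
  combiner     = "OR"

  conditions {
    display_name = "Dead-letter publish count > 0"
    condition_threshold {
      filter          = "metric.type=\"pubsub.googleapis.com/topic/send_message_operation_count\" AND resource.type=\"pubsub_topic\" AND resource.label.topic_id=\"${google_pubsub_topic.dead_letter.name}\""
      comparison      = "COMPARISON_GT"
      threshold_value = 0
      duration        = "0s"
      aggregations {
        alignment_period   = "60s"
        per_series_aligner = "ALIGN_COUNT"
      }
    }
  }

  notification_channels = [google_monitoring_notification_channel.email.id]
}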

Related

How do I get the last message sent by a specific user?

I'm currently running discord.js v13 and I'm trying to find a way of finding the last message sent by a specific user. So far, I've come up with the idea of iterating over an array of user IDs, then iterating over each channel and comparing each timestamp with the previous timestamp found until it finds the latest one and stores it. However, this doesn't seem to work and I just keep getting today's date for ALL the users even though the majority of them haven't sent a message today.
My Code:
for (var i = 0; i < array_of_users.length; i++) {
    for (const channel of message.guild.channels.cache.values()) {
        if (channel.isText()) {
            let messages = await channel.messages.fetch().catch(e => console.log(e));
            messages.each((msg) => {
                if (msg.author.id === array_of_users[i]) {
                    if (x == 0) {
                        old_date = new Date(msg.createdTimestamp).toDateString();
                        snowflakes.push(old_date);
                        x += 1;
                    } else {
                        var new_date = new Date(msg.createdTimestamp).toDateString();
                        let splitOld = old_date.split(" ");
                        let splitNew = new_date.split(" ");
                        if (parseInt(splitNew[3]) >= parseInt(splitOld[3])) {
                            year = splitNew[3];
                            if (getMonthInt(splitNew[1]) >= getMonthInt(splitOld[1])) {
                                monthC = splitNew[1];
                                if (getDayInt(splitNew[2]) >= getDayInt(splitOld[2])) {
                                    dayNum = splitNew[2];
                                } else {
                                    dayNum = splitOld[2];
                                }
                            } else {
                                monthC = splitOld[1];
                                dayNum = splitOld[2];
                            }
                        } else {
                            year = splitOld[3];
                            dayNum = splitOld[2];
                            monthC = splitOld[1];
                        }
                    }
                }
            })
        }
    }
    console.log(`${monthC}-${dayNum}-${year}`);
}
I wrote the code above to sort the dates and get the latest one. However, something is wrong: it keeps returning the latest date and sometimes even future dates that haven't occurred yet. I can't seem to find the issue, but I suspect it originates around the if (msg.author.id === array_of_users[i]) area.
I'm trying to find a way of finding the last message sent by a specific user.
Discord.JS provides no 'built in' way of doing this. The best you can do is grab their messages and compare each of them to one another until you've exhausted your options and are left with the optimal result. So your approach is (more or less) correct, although your implementation does have some functional errors that are throwing you off.
So far, I've come up with the idea of iterating on an array of user IDs, then iterating on each channel and comparing each timestamp with the pervious timestamp found until it finds the latest one and stores it.
If you are only looking for the most recent message from a specific User, then it is redundant to be dealing with an Array of Users altogether.
However, this doesn't seem to work and I just keep getting today's date for ALL the users even though the majority of them haven't sent a message today.
There are a number of errors in your approach which could be causing this. Here's a quick few of the more generalized / lack-of-experience related errors in your solution for you to work on:
The else block where you're comparing days split up from timestamps is super convoluted. Why not just use the createdTimestamp value from the Message instance itself, then reconstruct the whole monthC-dayNum-year thing again later on using that if you need to? (See the sketch after these tips.)
You're likely going to be hitting Discord's Rate limits (rather frequently) with your script as-is. To put it in perspective, you're basically spamming requests to their API for the next X messages in channels [A, B, ...ZA, ZB] repeatedly. You can optimize this a lot by making logical inferences as to when you can safely 'exclude' a Channel from your search (e.g. if the most recent message you've found in channel X was created on Jun 18 2022, and most recent message found in channel Y was created on Mar 13 2022, you can safely avoid asking Discord for more Messages for channel Y, as it won't have any information relevant to your use case). Writing some additional logic here will pay dividends to you both in how quickly your function can execute, and how frustrating it is to debug once Discord gets upset with you for your small-scale DDOS attack against them.
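On the first point: createdTimestamp is a plain millisecond number, so it compares directly with >. A minimal sketch of that simplification (userId is a placeholder; messages is a fetched collection as in your code):

// Compare raw millisecond timestamps instead of parsed date strings.
let latest = null;
messages.each((msg) => {
    if (msg.author.id === userId && (!latest || msg.createdTimestamp > latest.createdTimestamp)) {
        latest = msg;
    }
});
// Reconstruct a display date only at the end, if needed.
if (latest) console.log(new Date(latest.createdTimestamp).toDateString());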
If I was to solve this problem, the way I would do it would be to employ the following logic:
Given GuildMember X, grab all the Channels in the Guild they belong to.
Filter those Channels so you're only working on TextChannels.
Recursively iterate through those Channels, keeping track of the 'latest message' found belonging to GuildMember X in that Channel (when/if found). Whenever you find a new message belonging to GuildMember X in Channel Y, check if that beats the most recent message found across the entire set thus far. If so, update the best message for the set accordingly. Continually filter the Channels you're working with as you go to exclude Channels that logically cannot hold anything relevant to your search-- any time you finish a recursion, you can go back over the set of Channels and look for any one where the 'best message' in that channel does not beat the current best message. Since you're fetching 'backwards in time' (asking Discord for progressively older and older messages as you continue to recurse), any time the best message in a given channel does not beat the best message across the entire set, you can be certain that the channel does not hold the droids that you are looking for. You're only going to be fetching even older messages than the earliest one already fetched if you continue to look in that Channel, which makes that Channel irrelevant to you at that point.
Lucky for you, I had some time to kill while travelling today, and for whatever reason trying to find an optimal solution to your problem interested me enough to give it a proper shot. The tips I wrote above should help you understand the code below, and if not, leave a comment and maybe I can help you out. Here's how I would solve your problem, while paying attention to logical exclusion and avoiding doing unnecessary work / API calls.
https://codesandbox.io/s/holy-surf-50vpmf?file=/index.js
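For reference alongside that link, here is a condensed sketch of the exclusion logic described above (assumptions: discord.js v13, where channel.messages.fetch({ limit, before }) pages backwards in time and returns messages newest-first; targetId is a placeholder user ID):

// Find the most recent message by targetId across all text channels,
// skipping channels that can no longer beat the current best.
async function latestMessageBy(guild, targetId) {
    // Track per-channel cursors: the oldest message ID fetched so far.
    let candidates = guild.channels.cache
        .filter((c) => c.isText())
        .map((c) => ({ channel: c, cursor: undefined, done: false }));

    let best = null; // best message found across the whole set

    while (candidates.some((c) => !c.done)) {
        for (const c of candidates) {
            if (c.done) continue;
            const batch = await c.channel.messages
                .fetch({ limit: 100, before: c.cursor })
                .catch(() => null);
            if (!batch || batch.size === 0) { c.done = true; continue; }

            // Batches arrive newest-first; the oldest ID becomes the next cursor.
            c.cursor = batch.last().id;

            const hit = batch.find((m) => m.author.id === targetId);
            if (hit) {
                c.done = true; // nothing newer from targetId remains in this channel
                if (!best || hit.createdTimestamp > best.createdTimestamp) best = hit;
            } else if (best && batch.first().createdTimestamp < best.createdTimestamp) {
                // Everything left in this channel is older than the current best.
                c.done = true;
            }
        }
    }
    return best;
}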

Creating a cluster before sending a job to dataproc programmatically

I'm trying to schedule a PySpark Job. I followed the GCP documentation and ended up deploying a little python script to App Engine which does the following :
authenticate using a service account
submit a job to a cluster
The problem is, I need the cluster to be up and running, otherwise the job won't be sent (duh!), but I don't want the cluster to always be up and running, especially since my job only needs to run once a month.
I wanted to add the creation of a cluster to my python script, but the call is asynchronous (it makes an HTTP request) and thus my job is submitted after the cluster creation call, but before the cluster is really up and running.
How can I do this?
I'd like something cleaner than just waiting for a few minutes in my script!
Thanks
EDIT : Here's what my code looks like so far :
To launch the job
class EnqueueTaskHandler(webapp2.RequestHandler):
    def get(self):
        task = taskqueue.add(
            url='/run',
            target='worker')
        self.response.write(
            'Task {} enqueued, ETA {}.'.format(task.name, task.eta))

app = webapp2.WSGIApplication([('/launch', EnqueueTaskHandler)], debug=True)
The job
class CronEventHandler(webapp2.RequestHandler):
    def create_cluster(self, dataproc, project, zone, region, cluster_name):
        zone_uri = 'https://www.googleapis.com/compute/v1/projects/{}/zones/{}'.format(project, zone)
        cluster_data = {...}
        dataproc.projects().regions().clusters().create(
            projectId=project,
            region=region,
            body=cluster_data).execute()

    def wait_for_cluster(self, dataproc, project, region, clustername):
        print('Waiting for cluster to run...')
        while True:
            result = dataproc.projects().regions().clusters().get(
                projectId=project,
                region=region,
                clusterName=clustername).execute()
            # Handle exceptions
            if result['status']['state'] != 'RUNNING':
                time.sleep(60)
            else:
                return result

    def wait_for_job(self, dataproc, project, region, job_id):
        print('Waiting for job to finish...')
        while True:
            result = dataproc.projects().regions().jobs().get(
                projectId=project,
                region=region,
                jobId=job_id).execute()
            # Handle exceptions
            print(result['status']['state'])
            if result['status']['state'] == 'ERROR' or result['status']['state'] == 'DONE':
                return result
            else:
                time.sleep(60)

    def submit_job(self, dataproc, project, region, clusterName):
        job = {...}
        result = dataproc.projects().regions().jobs().submit(
            projectId=project, region=region, body=job).execute()
        return result['reference']['jobId']

    def post(self):
        dataproc = googleapiclient.discovery.build('dataproc', 'v1')
        project = '...'
        region = "..."
        zone = "..."
        clusterName = '...'
        self.create_cluster(dataproc, project, zone, region, clusterName)
        self.wait_for_cluster(dataproc, project, region, clusterName)
        job_id = self.submit_job(dataproc, project, region, clusterName)
        self.wait_for_job(dataproc, project, region, job_id)
        dataproc.projects().regions().clusters().delete(
            projectId=project, region=region, clusterName=clusterName).execute()
        self.response.write("JOB SENT")

app = webapp2.WSGIApplication([('/run', CronEventHandler)], debug=True)
Everything works until the deletion of the cluster. At this point I get a "DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded." Any idea?
In addition to general polling (either through list or get requests on the Cluster, or on the Operation returned with the CreateCluster request), for single-use clusters like this you can also consider using the Dataproc Workflows API, and possibly its InstantiateInline interface if you don't want to use full-fledged workflow templates. With this API you use a single request to specify cluster settings along with the jobs to submit; the jobs automatically run as soon as the cluster is ready to take them, after which the cluster is deleted automatically.
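A minimal sketch of that call, in the same googleapiclient style as the question's code (the template body is illustrative; the managedCluster config and job spec are placeholders you would fill in with your own settings):

import googleapiclient.discovery

dataproc = googleapiclient.discovery.build('dataproc', 'v1')

# One request: the cluster is created, the jobs run, the cluster is deleted.
template = {
    'placement': {
        'managedCluster': {
            'clusterName': 'monthly-job-cluster',
            'config': {...},  # same shape as cluster_data above
        }
    },
    'jobs': [{
        'stepId': 'run-pyspark-job',
        'pysparkJob': {'mainPythonFileUri': 'gs://your-bucket/job.py'},
    }],
}

operation = dataproc.projects().regions().workflowTemplates().instantiateInline(
    parent='projects/{}/regions/{}'.format(project, region),
    body=template).execute()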
You can use the Google Cloud Dataproc API to create, delete and list clusters.
The list operation can be (repeatedly) performed after create and delete operations to confirm that they completed successfully, since it provides the ClusterStatus of the clusters in the results with the relevant State information:
UNKNOWN: The cluster state is unknown.
CREATING: The cluster is being created and set up. It is not ready for use.
RUNNING: The cluster is currently running and healthy. It is ready for use.
ERROR: The cluster encountered an error. It is not ready for use.
DELETING: The cluster is being deleted. It cannot be used.
UPDATING: The cluster is being updated. It continues to accept and process jobs.
To prevent plain waiting between the (repeated) list invocations (in general not a good thing to do on GAE), you can enqueue delayed tasks in a push task queue (with the relevant context information), allowing you to perform such list operations at a later time. For example, in Python, see taskqueue.add():
countdown -- Time in seconds into the future that this task should run or be leased. Defaults to zero. Do not specify this argument if you specified an eta.
eta -- A datetime.datetime that specifies the absolute earliest time at which the task should run. You cannot specify this argument if the countdown argument is specified. This argument can be time zone-aware or time zone-naive, or set to a time in the past. If the argument is set to None, the default value is now. For pull tasks, no worker can lease the task before the time indicated by the eta argument.
If at the task execution time the result indicates the operation of interest is still in progress, simply enqueue another such delayed task: effectively polling, but without an actual wait/sleep.
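A sketch of that pattern in the same webapp2/taskqueue style as the question (the /poll-cluster URL and the 60-second countdown are illustrative choices; the question's usual imports, plus logging, are assumed):

class PollClusterHandler(webapp2.RequestHandler):
    def post(self):
        dataproc = googleapiclient.discovery.build('dataproc', 'v1')
        result = dataproc.projects().regions().clusters().get(
            projectId=self.request.get('project'),
            region=self.request.get('region'),
            clusterName=self.request.get('cluster')).execute()
        state = result['status']['state']
        if state == 'RUNNING':
            # Cluster is ready: enqueue the task that submits the job.
            taskqueue.add(url='/run', params=dict(self.request.params))
        elif state in ('CREATING', 'UPDATING'):
            # Not ready yet: re-check in 60 seconds, without sleeping.
            taskqueue.add(url='/poll-cluster',
                          params=dict(self.request.params),
                          countdown=60)
        else:
            logging.error('Cluster entered state %s', state)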

Reasons socket.io emit not receiving messages?

I'm working on an angular/node app where people can have many 1:1 chats with other users (like Whatsapp without groups) using socket.io and btford's angular-socket module (https://github.com/btford/angular-socket-io). Right now A) a client joins a socket.io room using emit. The client code is:
mySocket.emit('joinroom', room);
Server code is:
socket.on('joinroom', function (room) {
    socket.join(room);
});
B) chat messages are sent to server via emit. Client code is
mySocket.emit('sendmsg', data, function (data) {
    console.log(data);
});
and C) the server should send messages to others in the room via broadcast. Server code is:
socket.on('sendmsg', function (text, room, sender, recipient, timestamp) {
    // Some code here to save message to database before broadcasting to other users
    console.log('This works');
    socket.broadcast.to(room).emit('relaymsg', msg);
});
Client code is
$scope.$on('socket:relaymsg', function (event, data) {
    console.log('This only sometimes works');
    // do stuff to show that message was received
});
A and B seem to work fine, but C seems to be very unreliable. The server code seems to be OK, but the client does not seem to receive the message. Sometimes it works, and sometimes it does not, i.e. 'This works' always shows up, but 'This only sometimes works' does not always show up.
1) Any thoughts on what could be causing this issue? Are there any errors in my code?
2) Is broadcast and rooms the right way to be setting this up if there are many users, all of which can have multiple 1:1 chats with other users?
In case it helps, this is the factory code for the angular-socket module
.factory('mySocket', function (socketFactory, server) {
    var socket = socketFactory({
        ioSocket: io.connect(server)
    });
    socket.forward('relaymsg');
    return socket;
});
Appreciate any help you can provide!! Thanks in advance!
Thanks everyone for the comments, I believe I found the main issues. There were two things causing problems:
1) The bigger issue I think is that I'm using node clusters, and as a result users might join rooms on different workers and not be able to communicate with each other. I've ended up adding sticky sessions and Redis per the instructions here: http://socket.io/docs/using-multiple-nodes/
Sticky sessions is pretty useful; just as an FYI, since the docs don't mention it: the module automatically creates workers and re-spawns them if killed.
I couldn't find a ton of examples of how to implement sticky+redis since socket.io 1.0 is relatively new and seems to deal with Redis differently from prior versions, but these were very helpful:
https://github.com/Automattic/socket.io-redis/issues/31
https://github.com/evilstudios/chat-example-cluster/blob/master/index.js
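For reference, the Redis adapter wiring from those links boils down to a couple of lines (a sketch, assuming socket.io 1.x with the socket.io-redis module and a Redis instance on localhost):

var io = require('socket.io')(3000);
var redis = require('socket.io-redis');

// Route broadcasts through Redis so rooms and emits are shared
// across all the worker processes.
io.adapter(redis({ host: 'localhost', port: 6379 }));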
2) Every time the user closed their phone it would disconnect them from the chat room, even if the chat room was the last screen open on the phone
Hope that helps people in the future!

How to remove default disclaimer in javamail

When sending emails via javamail, the following is always appended to the bottom of each message:
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager. This message contains confidential information and
is intended only for the individual named. If you are not the named
addressee you should not disseminate, distribute or copy this e-mail.
Please notify the sender immediately by e-mail if you have received
this e-mail by mistake and delete this e-mail from your system. If you
are not the intended recipient you are notified that disclosing,
copying, distributing or taking any action in reliance on the contents
of this information is strictly prohibited.
How does one prevent this?
NOTE: This problem is extremely frustrating to research on the web, due to the fact that a disclaimer of this form is attached to so many indexed documents! :-(
JavaMail is not doing that; it is your outgoing SMTP server appending it to each message, probably set up by IT.
To confirm, you can use Gmail's servers (with a personal account) and you will see it does not get added to the messages.
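A quick sketch of that check, assuming the standard JavaMail API and Gmail's documented SMTP settings (the addresses and app password are placeholders):

import java.util.Properties;
import javax.mail.*;
import javax.mail.internet.*;

public class DisclaimerCheck {
    public static void main(String[] args) throws MessagingException {
        Properties props = new Properties();
        props.put("mail.smtp.host", "smtp.gmail.com");
        props.put("mail.smtp.port", "587");
        props.put("mail.smtp.auth", "true");
        props.put("mail.smtp.starttls.enable", "true");

        Session session = Session.getInstance(props, new Authenticator() {
            protected PasswordAuthentication getPasswordAuthentication() {
                // Placeholder credentials: use a personal test account.
                return new PasswordAuthentication("you@gmail.com", "app-password");
            }
        });

        Message msg = new MimeMessage(session);
        msg.setFrom(new InternetAddress("you@gmail.com"));
        msg.setRecipients(Message.RecipientType.TO, InternetAddress.parse("you@gmail.com"));
        msg.setSubject("Disclaimer test");
        msg.setText("If this arrives without the disclaimer, your SMTP server is adding it.");
        Transport.send(msg);
    }
}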
This should work. Pay attention to the form in which the email body gets parsed. In my case the emailBody string is on one line, so you have to put the "#Your disclaimer here#" on one line. Answering for those who come here in the future.
public String deleteDisclaimer(String emailBody) {
    String disclaimer = "#Your disclaimer here#";
    if (emailBody.contains(disclaimer)) {
        System.out.println("Deleting Disclaimer..");
        return emailBody.substring(0, emailBody.indexOf(disclaimer));
    }
    System.out.println("DISCLAIMER NOT FOUND!");
    return emailBody;
}

Pidgin: cannot send messages or set topics in chats via dbus

I would like to send messages to Pidgin chats or set chat topics via dbus. Following this guide I was able to write some pretty straightforward code to do just that, and it does indeed result in messages appearing or chat topics being changed... but it only seems to affect my window, without the other participants being aware of any messages or topic changes.
I'm using
purple.PurpleConvChatSetTopic(chat_data, user, topic)
and
purple.PurpleConvChatWrite(chat_data, user, message, flag, time)
I don't think this is due to any misuse of the dbus api as the calls actually result in actions. I just wonder if I need to perform some sort of authentication first? Or maybe the user can only be the current user? I tried with my nick and also setting it as unicode but to no avail.
Here is the complete code anyway:
import dbus
import time

# define chat_name, user, topic, message

bus = dbus.SessionBus()
obj = bus.get_object('im.pidgin.purple.PurpleService', '/im/pidgin/purple/PurpleObject')
purple = dbus.Interface(obj, 'im.pidgin.purple.PurpleInterface')

for p in purple.PurpleGetConversations():
    if purple.PurpleConversationGetName(p) == chat_name:
        chat = p

chat_data = purple.PurpleConversationGetChatData(chat)
purple.PurpleConvChatSetTopic(chat_data, user, topic)
purple.PurpleConvChatWrite(chat_data, user, message, 0, int(time.time()))
