I'm just curious: are there any other languages besides Java that use checked exceptions?
I did try to find information about this but couldn't find any answers.
The reason you couldn't find information on any other languages using checked exceptions is they learned from Java's mistake.
EDIT: So, to clarify a little more: checked exceptions were entirely a Java thing that sounded like a really great idea in theory, but in practice they create a tight coupling between the consuming function and the function being consumed. They also make it more tedious to handle an exception where it can actually be handled: every function between the point where the exception is thrown and the point where it is handled has to either declare it or catch and re-throw it. I could rewrite it all here, but I think this post does a magnificent job of explaining why checked exceptions are really not a good idea.
https://blog.philipphauer.de/checked-exceptions-are-evil/
Checked exceptions are not a common feature in mainstream languages due to bad experiences with them in Java. However, Java is not the only language that implements them, and just because Java's implementation was faulty does not mean that they are a bad feature in general.
Nim has checked exceptions.
Checked exceptions have been implemented as a Purescript library.
Checked exceptions can be implemented in Koka by making a custom defined effect for it.
Some common issues with checked exceptions in Java can be handled with better design.
Propagation of "throws" clauses through type signatures, which can force a lot of refactoring because every affected method signature has to be updated, can be solved via complete type inference. Haskell has a nice way of solving this for type classes (which propagate through type signatures the same way checked exceptions do, and can also be used to implement typed exceptions): partial type signatures, which essentially leave arbitrary parts of the type "blank" for the compiler to infer.
Issues with higher order functions/lambdas can be resolved via polymorphism/generics. In languages that implement checked exceptions via an effect system (like Koka) -- effect polymorphism is a particularly nice way to solve the problem.
Haskell, Koka, Purescript, and Nim are all highly functional languages that often make use of lambdas and higher order functions, and they don't have Java's issues with checked exceptions.
I am trying to figure out the best practices to deal with poison messages / unhandled exceptions with Apache Flink. We have a Job doing real time event processing of location data from IoT devices. There are two potential scenarios where this can arise:
Data is bad in some way - e.g. invalid value
Data triggers a bug due to some edge case we have not anticipated.
Currently, all my data processing stops because of just one message.
I've seen two suggestions:
Catch the exceptions - this requires me wrapping every piece of logic with something to catch every runtime exception
Use side outputs as a kind of DLQ - from what I can tell this seems to be a variation on #1 where I have to catch all the exceptions and send them to the side output.
Is there really no way to do this other than wrapping every piece of logic with exception handling? Is there no generic way to catch exceptions and have processing continue?
I think the idea is not to catch all kinds of exceptions and send them elsewhere, but rather to have well-tested and functioning code and use dead letters only for invalid inputs.
So a typical pipeline would be
source => validate => ... => sink
                  \=> dead letter queue
As soon as your record passes your validate operator, you want all errors to bubble up, as any error in these operators may result in corrupted aggregates and data that - once written - cannot be reverted easily.
The validate step would work with either of the two approaches that you outlined. Typically, side outputs have better semantics, but you may end up with more code.
Now you may have a service with high SLAs and actually want it to produce output even if it is corrupted, just to produce data. Or you have a simple transformation pipeline, where you'd miss some events but keep the majority (and downstream can deal with incomplete data). Then you are right that you need to wrap the code of all operators with try-catch. However, you'd typically still only do it for the fragile operators and not for all of them. Trivial operators should be tested and then trusted to work. Further, you'd usually only catch specific kinds of exceptions, to limit the scope to the errors you actually expect.
You might wonder why Flink doesn't have it incorporated as a default pattern. There are two reasons as far as I can see:
If Flink silently ignores any kind of exception and sends an extra message to a secondary sink, how can Flink ensure that the throwing operator is in a sane state afterwards? How can it avoid any kind of leaks that may happen because cleanup code is not executed?
It's more common in Java to let the developers explicitly reason about exceptions and exception handling. It's also not straight-forward to see what the requirements are: Do you want to have the input only? Do you also want to store the exception? What about the operator state that may have influenced the outcome? Should Flink still fail when too many errors have been received in a given time window? It quickly becomes a huge feature for something that should not happen at all in an ideal world where high quality data is ingested and properly processed.
So while it looks easy for your case because you exactly know which kinds of information you want to store, it's not easy to have a solution for all purposes, especially since the extra code that a user has to write is tiny compared to the generic solution.
What you could do is extract most of the complicated logic into a single ProcessFunction and use side outputs as you have outlined. Since it's a central piece, you'd only need to write the side-output logic once. If it's needed in multiple places, you could extract a helper function to which you pass your actual code as a RunnableWithException lambda, hiding all the side-output logic. Make sure you use plenty of finally blocks to ensure a sane state.
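The shape of such a helper can be sketched without any Flink API. This is only a plain C++ illustration of the idea, not Flink code: `processOrDeadLetter`, `DeadLetter`, and `parseLatitude` are hypothetical names, and two vectors stand in for the main output and the side output. Only the expected error type is caught, so genuine bugs still bubble up.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record kept for every rejected input.
struct DeadLetter {
    std::string input;
    std::string reason;
};

// Runs `work` on one input. Results go to mainOutput; inputs that fail with
// the expected error type go to deadLetters. Any other exception propagates,
// so unexpected failures still fail loudly.
template <class Out, class Work>
void processOrDeadLetter(const std::string& input, Work work,
                         std::vector<Out>& mainOutput,
                         std::vector<DeadLetter>& deadLetters) {
    try {
        mainOutput.push_back(work(input));
    } catch (const std::invalid_argument& e) {
        deadLetters.push_back(DeadLetter{input, e.what()});
    }
}

// Hypothetical validation step: parse a latitude and check its range.
double parseLatitude(const std::string& text) {
    double value = std::stod(text);  // throws std::invalid_argument on non-numeric text
    if (value < -90.0 || value > 90.0) {
        throw std::invalid_argument("latitude out of range: " + text);
    }
    return value;
}
```

Calling `processOrDeadLetter(input, parseLatitude, ok, dead)` for each incoming record keeps the try/catch in one place instead of in every operator.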
I'd also add quite a few IT cases and use mutation testing to harden your pipeline quicker. If you keep your test data inline, the mutants may also exactly simulate your unexpected data issues, such that your validate operator gets more complete.
I'm considering extending the MultiMap methods in Dapper to support more than 5 types. Was just curious as to whether there was a technical/performance reason for 5 or was it just an arbitrary number?
It was fairly arbitrary, and due in part to some implementation particulars that make it pretty awkward to extend arbitrarily - in particular because it uses generics. Changing to an implementation that doesn't use generics would allow a more type-array based approach, but then the lambdas etc (to stitch the data back together) become pretty ugly. There are, IIRC, some pending things in the pull request queue relating to this, but I have not had much available time to review them as of yet.
Also: arguably, if you're doing a query that involves that many types, you're probably already doing something pretty complex; it is hard to expose a friendly API for arbitrarily complex systems.
Just wanted to make you aware that more types have already been supported. (Just helping you NOT reinvent the wheel)
https://code.google.com/p/dapper-dot-net/issues/detail?id=50
At the bottom of the page you can get a git-hub change.
Matt
I am reading The Pragmatic Programmer: From Journeyman to Master by Andrew Hunt, David Thomas. When I was reading about a term called orthogonality I was thinking that I am getting it right. I was understanding it very well. However, at the end of the chapter a few questions were asked to measure the level of understanding of the subject. While I was trying to answer those questions to myself I realized that I haven't understood it perfectly. So to clarify my understandings I am asking those questions here.
C++ supports multiple inheritance, and Java allows a class to
implement multiple interfaces. What impact does using these facilities
have on orthogonality? Is there a difference in impact between using multiple
inheritance and multiple interfaces?
There are actually three questions bundled up here: (1) What is the impact of supporting multiple inheritance on orthogonality? (2) What is the impact of implementing multiple interfaces on orthogonality? (3) What is the difference between the two sorts of impact?
Firstly, let us get to grips with orthogonality. In The Art of Unix Programming, Eric Raymond explains that "In a purely orthogonal design, operations do not have side effects; each action (whether it's an API call, a macro invocation, or a language operation) changes just one thing without affecting others. There is one and only one way to change each property of whatever system you are controlling."
So, now look at question (1). C++ supports multiple inheritance, so a class in C++ could inherit from two classes that have the same operation but with two different effects. This has the potential to be non-orthogonal, but C++ requires you to state explicitly which parent class has the feature to be invoked. This will limit the operation to only one effect, so orthogonality is maintained. See Multiple inheritance.
And question (2). Java does not allow multiple inheritance. A class can only derive from one base class. Interfaces are used to encode similarities which the classes of various types share, but do not necessarily constitute a class relationship. Java classes can implement multiple interfaces but there is only one class doing the implementation, so there should only be one effect when a method is invoked. Even if a class implements two interfaces which both have a method with the same name and signature, it will implement both methods simultaneously, so there should only be one effect. See Java interface.
And finally question (3). The difference is that C++ and Java maintain orthogonality by different mechanisms: C++ by demanding that the parent is explicitly specified, so there will be no ambiguity in the effect; and Java by implementing similar methods simultaneously, so there is only one effect.
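The C++ half of this can be illustrated with a minimal sketch. The class names `Printer`, `Scanner`, and `Copier` are made up for the example; the point is only that an unqualified call to the inherited operation is rejected as ambiguous, so the base class has to be named.

```cpp
#include <cassert>
#include <string>

// Two unrelated base classes that happen to expose the same operation.
struct Printer {
    std::string describe() const { return "printer"; }
};

struct Scanner {
    std::string describe() const { return "scanner"; }
};

// Copier inherits describe() from both bases.
struct Copier : Printer, Scanner {
    // Calling describe() unqualified on a Copier does not compile because the
    // name is ambiguous; each use must say which base class is meant.
    std::string describeAsPrinter() const { return Printer::describe(); }
    std::string describeAsScanner() const { return Scanner::describe(); }
};
```

A caller can also qualify the call directly, as in `copier.Printer::describe()`.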
Irrespective of the number of interfaces or classes you extend, there will be only one implementation inside that class. Let's say your class is X.
Now orthogonality says: one change should affect only one module.
If you change your implementation of one interface in class X, will it affect other modules/classes using your class X? The answer is no, because the other modules/classes are coded against the interface, not the implementation.
Hence orthogonality is maintained.
Any programming language that does not have a suitable reflection mechanism I find seriously debilitating for rapidly changing problems.
It seems that with certain languages it's incredibly hard or not possible to do:
Convention over Configuration
Automatic Databinding
AOP / Meta programming
without reflection.
Some example languages that do not have some sort of programmatic reflection are:
C, C++, Haskell, OCaml. I'm sure there are plenty more.
One example of DRY (Don't Repeat Yourself) being violated in most of these languages is writing unit tests. You almost always need to register your test cases somewhere other than where you define the test.
How do programmers of these languages mitigate this problem?
EDIT: Common languages that do have reflection for those that do not know are: C#, Java, Python, Ruby, and my personal favorite F# and Scala.
EDIT: The two common approaches it seems are code instrumentation and code generation. However I have never seen instrumentation for C.
Instead of just voting to close, could someone please comment on why this should be closed, and I'll delete the post.
You don't.
But you can keep the repetitions close to each other so when changing something, you see something else has to be changed too.
For example, I wrote a JSON-Parser that outputs objects, a typical call looks like this:
struct SomeStruct
{
    int a;
    int b;
    double c;
    typedef int serializable;
    template<class SerializerT> void serialize(SerializerT& s)
    {
        s("a",a)("b",b)("c",c);
    }
};
Sure, when you add a field, you have to add another field in the function, but maybe you don't want to serialize that field (something you'd have to handle in languages with reflection, too), and if you delete a field without removing it from the function, the compiler will complain.
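The serializer type itself isn't shown above, so here is a minimal sketch of what one could look like. `TextSerializer` and `toText` are hypothetical names, not part of the parser described; the only requirement taken from the struct is that `operator()` accepts a name and a value and returns the serializer so the calls can be chained.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical serializer: each call records one field as name=value and
// returns *this so calls chain as in s("a",a)("b",b)("c",c).
class TextSerializer {
public:
    template <class T>
    TextSerializer& operator()(const char* name, const T& value) {
        if (!first_) out_ << ",";
        out_ << name << "=" << value;
        first_ = false;
        return *this;
    }

    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
    bool first_ = true;
};

// The struct from above; the field list is repeated only inside serialize().
struct SomeStruct {
    int a;
    int b;
    double c;
    typedef int serializable;
    template <class SerializerT> void serialize(SerializerT& s) {
        s("a", a)("b", b)("c", c);
    }
};

// Convenience wrapper: run any serializable value through TextSerializer.
template <class T>
std::string toText(T& value) {
    TextSerializer s;
    value.serialize(s);
    return s.str();
}
```

Because `serialize` is a template, the same field list works with any other serializer that offers the same chained `operator()`, for example one that reads values instead of writing them.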
I think it's a matter of degree. Reflection is just one very powerful method of avoiding repetition.
Any time you generalize a function from a specific case you are using DRY principle, the more general you make it the more DRY it is. Just because some languages don't get you where you get with reflection doesn't mean there aren't DRY ways of programming with them. They may not be as DRY, but that doesn't mean they don't have their own unique advantages which in total sum may outweigh the advantages of using a language that has reflection. (For example, speed consequences from heavy use of reflection could be a consideration.)
Also, one method of getting something like the DRY benefits of reflection with a language that doesn't support it is by using a good code-generation tool. In that case you modify the code for different cases once, in the code generation template, and the template pushes it out to different instances in code. (I'm not saying whether or not using code generation is a good thing, but with a good "active" generator it is certainly one way of getting something like the DRY benefit of reflection in a language that doesn't have reflection. And the benefits of code generation go beyond this simple benefit. I'm thinking of something like CodeSmith, although there are many others: http://www.codesmithtools.com/ )
Abstractly, do more at runtime, without the benefits of things like compile-time type checking (you have to essentially write your own type-checking routines) and beautiful code. E.g., use a table instead of a class. (But if you did this, why not use a dynamically-typed language instead?) This is often bad. I do not recommend this.
In C++, generic programming techniques allow you to programmatically include members of a class (is that what you want to do?) via inheritance.
One nice example for C++ unit testing is cxxtest: http://cxxtest.tigris.org/. It uses naming conventions and a Python script that post-processes your C++ to generate the test suite.
A good way to think about getting around restrictions in languages is Michael Feathers' notion of "seams". A seam is a place where your program's behavior can be changed without editing the code at that place. For example, in C the pre-processor and linker provide seams. In C++, polymorphism is another. In more dynamic languages, where you can change method definitions or reflect, you get even more flexibility. Without seams things can be more complicated, and sometimes you just don't want to try to hammer a nail with your shoe but rather go with the flow of the tool at hand.
I am rewriting code to handle some embedded communications and right now the protocol handling is implemented in a While loop with a large case/switch statement. This method seems a little unwieldy. What are the most commonly used flow control methods for implementing communication protocols?
It sounds like the "while + switch/case" is a state machine implementation. I believe that a well-thought-out state machine is often the easiest and most readable way to implement a protocol.
When it comes to state machines, breaking some of the traditional programming rules comes with the territory. Rules like "every function should be less than 25 lines" just don't work. One might even argue that state machines are GOTOs in disguise.
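To make that concrete, here is a small sketch of a byte-at-a-time state machine. The frame layout (a 0x7E start byte, one length byte, then the payload) is invented for the example, as is the `FrameParser` name; the receive loop that feeds bytes plays the role of the "while", and the switch lives in `feed()`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Parses frames of the assumed form: 0x7E, length byte, <length> payload bytes.
class FrameParser {
public:
    // Consumes one byte. Returns true when a complete frame has just been
    // received; its payload is then available through payload().
    bool feed(std::uint8_t byte) {
        switch (state_) {
        case State::WaitStart:
            // Ignore everything until the start marker shows up.
            if (byte == kStart) state_ = State::ReadLength;
            return false;
        case State::ReadLength:
            payload_.clear();
            remaining_ = byte;
            if (remaining_ == 0) {  // empty frame is complete immediately
                state_ = State::WaitStart;
                return true;
            }
            state_ = State::ReadPayload;
            return false;
        case State::ReadPayload:
            payload_.push_back(byte);
            if (--remaining_ == 0) {
                state_ = State::WaitStart;
                return true;
            }
            return false;
        }
        return false;  // not reached; all states are handled above
    }

    const std::vector<std::uint8_t>& payload() const { return payload_; }

private:
    enum class State { WaitStart, ReadLength, ReadPayload };
    static constexpr std::uint8_t kStart = 0x7E;

    State state_ = State::WaitStart;
    std::uint8_t remaining_ = 0;
    std::vector<std::uint8_t> payload_;
};
```

Keeping the state in an object rather than in local variables of one big loop also makes it easy to run one parser per connection.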
For cases where you key off of a field in a protocol header to direct you to the next stage of processing for that protocol, arrays of function pointers can be used. You use the value from the protocol header to index into the array and call the function for that protocol.
You must handle all possible values in this array, even those which are not valid. Eventually you will get a packet containing the invalid value, either because someone is trying an attack or because a future rev of the protocol adds new values.
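A minimal sketch of such a table, assuming a one-byte message-type field so that 256 slots cover every possible value. The message types 0x01 and 0x02 and all function names are invented for the example; every slot that is not explicitly assigned points at a rejecting fallback.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class Result { Handled, Rejected };

// Every handler has the same signature so it can live in one table.
using Handler = Result (*)(const std::uint8_t* payload, std::size_t length);

Result handlePing(const std::uint8_t*, std::size_t) { return Result::Handled; }

Result handleData(const std::uint8_t* payload, std::size_t length) {
    // A data message without a payload is treated as malformed in this sketch.
    return (payload != nullptr && length > 0) ? Result::Handled : Result::Rejected;
}

// Fallback for every message type the protocol does not define (yet).
Result rejectUnknown(const std::uint8_t*, std::size_t) { return Result::Rejected; }

// One slot per possible value of the type byte, so no index is out of range.
std::array<Handler, 256> makeHandlerTable() {
    std::array<Handler, 256> table;
    table.fill(&rejectUnknown);
    table[0x01] = &handlePing;
    table[0x02] = &handleData;
    return table;
}

Result dispatch(std::uint8_t messageType, const std::uint8_t* payload, std::size_t length) {
    static const std::array<Handler, 256> table = makeHandlerTable();
    return table[messageType](payload, length);
}
```

If the type field is wider than one byte, a full table stops being practical and a range check before indexing does the same job.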
If it is all one protocol being handled then a switch/case statement may be your best bet. However you should break all the individual message handlers into their own functions.
If your switch statement contains any code to actually handle the messages, then you would be better off breaking it out.
If it is handling multiple similar protocols, you could create a class for each one, all derived from the same abstract class. When a connection comes in, you determine which protocol it is and create an instance of the appropriate handler class to decode and handle the communications.
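A rough sketch of that arrangement follows. The protocol names "ECHO/1" and "LEN/1", the greeting-based detection, and all class names are assumptions made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Common interface shared by all protocol handlers.
class ProtocolHandler {
public:
    virtual ~ProtocolHandler() = default;
    virtual std::string handle(const std::string& message) = 0;
};

// Hypothetical protocol 1: replies with the message it received.
class EchoHandler : public ProtocolHandler {
public:
    std::string handle(const std::string& message) override {
        return "echo:" + message;
    }
};

// Hypothetical protocol 2: replies with the length of the message.
class LengthHandler : public ProtocolHandler {
public:
    std::string handle(const std::string& message) override {
        return std::to_string(message.size());
    }
};

// Picks a handler from the first line sent on a new connection.
// Returns nullptr for an unknown protocol so the caller can drop the connection.
std::unique_ptr<ProtocolHandler> makeHandler(const std::string& greeting) {
    if (greeting == "ECHO/1") return std::make_unique<EchoHandler>();
    if (greeting == "LEN/1") return std::make_unique<LengthHandler>();
    return nullptr;
}
```

The connection code then only talks to `ProtocolHandler`, so adding a protocol means adding one class and one line in `makeHandler`.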
I would think this depends largely on the language you are using, and what sort of data set objects you have available to you.
In Python, for example, you could create a dictionary that maps each message type to its handler, and just look up the right method/function to call.
Case/switch statements aren't bad things, but if they get huge (as they can with massive numbers of protocol handlers), they become unwieldy to work with.