Half done – double pain
I am checking some alerts. They are supposed to verify some complicated conditions, testing values decided by a customer. The alert names are fancy and do not tell the whole story, that is, all the checks that are actually done. An alert name cannot be changed, so over time it has diverged from what is checked, and even from which customer it is set up for. Some of the conditions are not even receiving data.
Frustration
At some point a colleague has done a really shitty job. What should I do now, go to him and complain? Redo the job for him? Listen to the story of how he was in a rush?
The problems with half-done stuff, done in a rush, are always the same.
It is half done, so it is not good: you cannot trust it.
If all seems good, you always have a doubt: is it really true?
If something is wrong now that used to work, it makes you lose time.
So confidence is lost, and you are sure you will pay again with your time for something not finished.
If this is something only for yourself, you can maybe live with it for a while: only you or your team knows.
It is much worse if somebody outside your team notices it: your stuff starts becoming suspicious and people lose confidence, not just in that part but ultimately in everything your team is doing.
So please, if it is half-done, it is just not-done, no bullshit: the sooner you admit it is not done, the better.
Java constant pool
From time to time I still read the Java magazine; sometimes the articles you find there are very esoteric.
Java class file constant pool: this article explains that class files contain a table of constants. This is quite normal; I remember it is the same in Windows executables and Linux binaries: there is always a table containing string constants, numeric constants, function names and so on.
I never imagined it was so structured in Java: you have constants referencing classes and methods, and these constants are used so extensively in the bytecode. So if you have 15 minutes to spend on the article, you will get some truly esoteric knowledge.
IntelliJ editor shortcuts
Being a lame user, I only use the menus and I use the mouse a lot to do everything.
A good friend sent me this link, a presentation from a Java champion about productivity with IntelliJ. I never imagined such features existed in an editor, and now I finally understand why an IDE can get so big.
Have a look at https://www.youtube.com/watch?v=cK19rE2V9UY , it will help you a lot.
IntelliJ and multiple projects
I am new to IntelliJ, I started using it a few days ago. I see you can quickly start working on a new project and have multiple windows for multiple projects. So far so good.
Usually I have one big Maven project composed of many submodules, so I was quite happy with this one-project-one-window approach.
Now I have changed team and the approach is different here: different repos with different projects, one referencing the other. One window per project was not nice anymore, so I started looking for something similar to the Eclipse project tree.
It is actually easy to do: you just create one empty project, then, for each of the projects you want to import, you create a new module from existing sources. And that's all, you have many projects in the same window.
But this does not fit what I need: I realized that clicking on a class brings me to decompiled sources and not to the real sources contained in the other module… I definitely need to take more time exploring IntelliJ and getting used to how it works.
Another day spent reading sources
Yep, you cannot easily keep all the sources in a single repository, and when you have many of them it is easy to miss the one you need. Repos are usually organized in a pyramidal way: the main project, and inside it the many repos managed by the team. It is like they are tagged by team and not by purpose.
The pyramidal organization is easy to achieve and has the nice property that each element has just one parent, but you may have multiple concerns. You may want to group projects together with labels: these repos are for deployment, these are for scheduling, these are for that project… and the concerns may overlap with one another.
This makes me think of the switch from tree-organized mail to the Gmail organization with labels: at the beginning it is strange, but in the end it simplifies life.
This is definitely a wish I make to the Atlassian guys: add labels to repositories, so that you can focus on related things. Please do it.
Till then I am still relying on the fact that similar repos start more or less the same way, SPbla, SPbli…
Yet another way of doing unit testing
scalatest
An easy-to-read article: you write test("my name") and define a code block: https://www.scalatest.org/getting_started_with_fun_suite
It is a bit boring to learn yet another way of doing tests, but at least you do not need to define the useless function signatures and so on.
This seems a more complete description of FunSuite, whose "Fun" stands for function and not just for fun; let's hope it is also a bit fun…
kerberos login
It finally happened: after years of hearing about Kerberos from time to time, I had to use it, and at least for switching users it turned out to be stupidly simple.
kinit -k -t /path/tokeys/repo/temp.k who@DOMAIN
ksu newuser -e /bin/bash
The first command obtains a ticket from the keytab without asking for a password, the second switches to the new user using that ticket. Then you have to know where the keytab file is, and somebody has to have done the setup already, but this is another story.
Learning Spark 2nd edition
I was searching for a book on Spark on the O'Reilly site and I found this one. Luckily the PDF is available online and you do not need to pay for it: https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
port forwarding and ubuntu firewall for hadoop
I still needed to use Firefox inside the VM to reach the Hadoop UIs.
The problem was the Ubuntu firewall:
# ufw allow 50075
Rule added
# ufw allow 18080
Rule added
# ufw allow 50070
Rule added
# ufw allow 8042
Rule added
# ufw allow 8088
Rule added
# ufw allow 50090
Rule added
# ufw allow 4040
Rule added
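Instead of typing the rule once per port, the same list can be produced with a loop; a sketch that only prints the commands (pipe its output to sh as root to actually apply them):

```shell
# Print one "ufw allow" command per Hadoop/Spark UI port
# (pipe the output to sh as root to actually open the ports).
for port in 50075 18080 50070 8042 8088 50090 4040; do
    echo "ufw allow $port"
done
```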
hadoop and spark on ubuntu
Hi, I will start working on a big data project, so I am setting up my environment.
As usual the versions to use are quite old: once you have a project running, it is difficult to upgrade to the latest versions.
I am quite old too: I did not think of using Docker images to set things up. When I realized it, I tried to find some Hadoop images, but a quick Google search did not turn up official ones, so I kept my manual environment setup.
So, an Ubuntu server that runs without a UI:
sudo systemctl set-default multi-user.target
This does the magic. Then I set the DISPLAY environment variable to have MobaXterm serve X11: in this way I can use IntelliJ from the box, mixed with Windows applications. The same goes for gedit and so on.
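For reference, the variable looks like this (the address here is just an example; MobaXterm shows the real value to use next to its X server icon):

```shell
# Point X11 clients (IntelliJ, gedit, ...) at the X server MobaXterm runs on Windows.
# The host address is an example; MobaXterm displays the actual one to use.
export DISPLAY=192.168.56.1:0.0
```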
Then I installed the glorious spark-2.1.0-bin-hadoop2.7 and hadoop-2.7.0. The setup took a lot of time; luckily there are plenty of guides that go step by step.
This one is very nice: https://phoenixnap.com/kb/install-hadoop-ubuntu
For Spark I also had to set up some environment variables concerning the logs location; it took a while.
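The exact settings depend on the installation, but they are of this kind (a sketch: SPARK_LOG_DIR and the event-log properties are standard Spark settings, the paths are assumptions, and the HDFS directory must exist before the history server starts):

```shell
# conf/spark-env.sh (sketch): where the Spark daemons write their logs
export SPARK_LOG_DIR=/var/log/spark

# conf/spark-defaults.conf (sketch, shown here as comments): event logs
# read by the history server on port 18080
#   spark.eventLog.enabled           true
#   spark.eventLog.dir               hdfs://localhost:9000/spark-logs
#   spark.history.fs.logDirectory    hdfs://localhost:9000/spark-logs
```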
Now the Spark console spark-shell works and I can play a bit.
There are a lot of useful ports to monitor the processes.
Useful ports for hadoop and spark
NameNode (fs.defaultFS): hdfs://localhost:9000
NameNode UI: http://localhost:50070/dfshealth.html#tab-overview
Secondary NameNode UI: http://0.0.0.0:50090
DataNode UI: http://0.0.0.0:50075
YARN ResourceManager UI: http://localhost:8088
YARN NodeManager UI: http://localhost:8042/node
Spark HistoryServer UI: http://127.0.0.1:18080
Spark shell UI: http://127.0.0.1:4040