Graylog doesn't start after upgrading to version 4 - Advice on fixing this

So after all the Log4J stuff going on, I quickly upgraded to the latest version of Graylog to help mitigate any issues.
I have a server set up with Ubuntu 20.04 using the Graylog repository, so upgrading was a simple case of running apt-get update && apt-get upgrade.

I then ran into an odd issue where Graylog wouldn’t start. The web interface was down, and when I checked the graylog-server service it kept showing as exited. When I looked at the Graylog logs (/var/log/graylog-server/server.log) I noticed it kept reporting that it was unable to connect to any Elasticsearch instances.

As part of the upgrade, it had migrated me up to Elasticsearch 7.10.
When I checked the Elasticsearch logs (/var/log/elasticsearch/graylog.log) I could see it was complaining that http_port was incorrect and asking whether I meant http.port.
Now I’m not sure if that happened as part of the upgrade or if it was a deprecated setting that was later replaced by http.port, but I thought I would save someone else the hassle of tracking this down by posting it here.
If you run into this issue, just check /etc/elasticsearch/elasticsearch.yml and make sure you have http.port instead of http_port.
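
For anyone who would rather script the check, here is a minimal sketch of it. The function name fix_http_port is made up for illustration, and it assumes the setting sits on its own line as a key followed by a colon. It takes the file path as an argument so you can try it on a copy first, and it saves a .bak copy before changing anything.

```shell
#!/bin/sh
# Hypothetical helper: rename a misspelled http_port key to http.port
# in the Elasticsearch config file passed as the first argument.
fix_http_port() {
    conf="$1"
    # Match http_port only when it is the key at the start of a line;
    # commented-out lines (starting with #) are left alone.
    if grep -q '^[[:space:]]*http_port[[:space:]]*:' "$conf"; then
        cp "$conf" "$conf.bak"   # keep a backup before editing
        # Rewrite from the backup into the original file.
        sed 's/^\([[:space:]]*\)http_port\([[:space:]]*:\)/\1http.port\2/' \
            "$conf.bak" > "$conf"
        echo "fixed"
    else
        echo "ok"
    fi
}
```

Running fix_http_port /etc/elasticsearch/elasticsearch.yml as root should print "fixed" if the old key was found and renamed, or "ok" if there was nothing to change.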

Once you have made this change, run
systemctl start elasticsearch
followed by
tail -f /var/log/elasticsearch/graylog.log
Keep an eye on that log, because Elasticsearch needs to do an update_mapping for all your old log messages. Until that has finished it’s likely you won’t be able to search any of your old logs.
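
If you don't want to sit watching the tail output, a rough way to gauge progress is to count the update_mapping entries logged so far. This is only a sketch: the function name count_mapping_updates is made up, and it assumes the log lines contain the literal text update_mapping.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a log file that mention
# update_mapping. Assumes that literal text appears in the log.
count_mapping_updates() {
    log="$1"
    # grep -c prints the number of matching lines (0 if there are none);
    # "|| true" keeps the function from failing when the count is 0.
    grep -c 'update_mapping' "$log" || true
}
```

Run count_mapping_updates /var/log/elasticsearch/graylog.log a few times. Once the number stops going up, the mapping updates have probably finished.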

Hope this helps someone else who runs into this issue.

I too have this problem. I have not had time to debug the cause(s) yet. However, until you posted I had assumed the problem started when I made the rookie mistake of including the Graylog enterprise packages by accident. I just cut-n-pasted the Graylog install steps without paying attention or reading the notes closely - I do not need the optional enterprise packages.

I have bookmarked your post in case you figure it out and have time to post your findings. If I figure it out when I get back to my offices in the new year I too will post what I found out.

If you tail the Graylog server logs and they say it can't find any reachable Elasticsearch instances, then I suspect you have the same problem, and the advice in my post shows how to fix it. I posted it as a how-to since I didn't find anything out there that already explains it, and Graylog's own documentation didn't really cover this issue either, so I reckon it's a weird edge case.

You should also be able to uninstall the enterprise packages fairly easily using the apt-get command. If you run into any issues then post them up here and I'm sure either I or someone else will be able to offer a solution :slight_smile:

I think my problem is different; the /var/log/graylog-server/server.log file shows me this error (over and over):

2022-01-07T15:36:57.880Z ERROR [CmdLineTool] Invalid configuration
com.github.joschi.jadconfig.ValidationException: Parameter password_secret should not be blank
        at com.github.joschi.jadconfig.validators.StringNotBlankValidator.validate( ~[graylog.jar:?]
        at com.github.joschi.jadconfig.validators.StringNotBlankValidator.validate( ~[graylog.jar:?]
        at com.github.joschi.jadconfig.JadConfig.validateParameter( ~[graylog.jar:?]
        at com.github.joschi.jadconfig.JadConfig.processClassFields( ~[graylog.jar:?]
        at com.github.joschi.jadconfig.JadConfig.process( ~[graylog.jar:?]
        at org.graylog2.bootstrap.CmdLineTool.processConfiguration( [graylog.jar:?]
        at [graylog.jar:?]
        at org.graylog2.bootstrap.Main.main( [graylog.jar:?]

So I will figure out what that is about, and post an update here.

So… my /etc/graylog/server/server.conf file has a blank password_secret, which seems pretty straightforward.
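
A quick way to spot this kind of thing is to look for required settings that are present but empty. The sketch below assumes server.conf uses the usual "name = value" layout. The function name blank_settings is made up for illustration, and it only checks the two settings mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the names of password_secret and
# root_password_sha2 if they appear in the given file with no value.
blank_settings() {
    conf="$1"
    # Match "name =" followed by nothing but optional whitespace,
    # then strip everything except the setting name.
    grep -E '^[[:space:]]*(password_secret|root_password_sha2)[[:space:]]*=[[:space:]]*$' "$conf" \
        | sed 's/[[:space:]]*=.*$//; s/^[[:space:]]*//'
}
```

blank_settings /etc/graylog/server/server.conf prints one setting name per line, and prints nothing if both have values.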

I had three problems all caused by the same issue - my /etc/graylog/server/server.conf file was reset. Very irritating!

Once I had set the password_secret (fortunately I had made a note of it) and root_password_sha2 values, and set http_bind_address to the IPv4 address of the Graylog host, I could log in again.

Weirdly, the Graylog webapp admin password I had set (which LastPass sets) is not accepted. I have also tried the default admin password, which is admin, and that does not work either. That would suggest the password_secret is wrong, but I can log in with another username and password OK.
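
For anyone else rebuilding a reset server.conf: as far as I know the built-in admin login is checked against root_password_sha2, which is a SHA-256 hash of the password, so it may be worth re-hashing the password you expect and comparing it with what is in the file. Here is a minimal sketch, with made-up function names. Reusing your original password_secret is the better option if you noted it down, so gen_password_secret is only for cases where it is lost.

```shell
#!/bin/sh
# Hypothetical helpers for rebuilding the two values in server.conf.

# Generate a random 96-character alphanumeric string for password_secret.
gen_password_secret() {
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 96
    echo
}

# Print the SHA-256 hex digest of the password given as the first
# argument, which is the format root_password_sha2 expects.
hash_root_password() {
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
}
```

For example, hash_root_password 'yourpassword' prints the value to paste after root_password_sha2 =. Bear in mind that passing the password as an argument leaves it in your shell history.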