adventures into the land of the command line

installing splunk

splunk is a big data tool for monitoring and analysing your computer systems and networks. you can ingest large amounts of machine-generated data, then use splunk’s map-reduce style search to dig through it and get results back quickly.

a neat example of something you could do with splunk is to ingest your apache access logs, then use an ip lookup against the maxmind geoip database to build a dashboard that plots, on a map of the world, where the requests to your web server are coming from over time.

you can ingest a small amount of data per day for free (500 MB on the free license), but for larger amounts you need to purchase a license.

this post is about installing splunk locally and creating a search index


installing splunk

download an rpm from a link like this (the version in the release directory should match the version in the file name)

$ wget http://download.splunk.com/products/splunk/releases/6.0.1/splunk/linux/splunk-6.0.1-189883-linux-2.6-x86_64.rpm

to install the Splunk RPM in the default directory /opt/splunk

$ rpm -i splunk-6.0.1-189883-linux-2.6-x86_64.rpm

or with a tarball

$ tar -xzvf splunk-6.0.1-189883-linux-2.6-x86_64.tgz -C /opt/
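if you script the install, the two routes above can be picked from the file extension. this is only a sketch: install_cmd is a name made up for this post, and it prints the command instead of running it so you can look it over first.

```shell
# print the install command for a splunk package, based on its extension.
# install_cmd is a hypothetical helper, not part of splunk.
install_cmd() {
    pkg="$1"
    case "$pkg" in
        *.rpm) echo "rpm -i $pkg" ;;
        *.tgz) echo "tar -xzvf $pkg -C /opt/" ;;
        *)     echo "unknown package type: $pkg" >&2; return 1 ;;
    esac
}

install_cmd splunk-6.0.1-189883-linux-2.6-x86_64.rpm
install_cmd splunk-6.0.1-189883-linux-2.6-x86_64.tgz
```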

to start splunk

$ $SPLUNK_HOME/bin/splunk start --accept-license

where $SPLUNK_HOME is the place you installed splunk, /opt/splunk by default
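$SPLUNK_HOME is not necessarily set in your shell, so it can help to set it yourself before running the commands in this post. a minimal sketch, assuming the default /opt/splunk location unless you have already exported something else:

```shell
# fall back to the default install location if SPLUNK_HOME is not already set
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
export SPLUNK_HOME

# report whether the splunk binary is where we expect it
if [ -x "$SPLUNK_HOME/bin/splunk" ]; then
    echo "splunk found in $SPLUNK_HOME"
else
    echo "no splunk binary under $SPLUNK_HOME/bin"
fi
```

you could put the first two lines in the shell profile of the user that runs splunk.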

you can start and stop individual splunk processes by adding the process as an object to the start or stop command. the objects include

splunkd, the splunk server daemon
splunkweb, splunk’s web interface process

for example, to start only splunkd

$ $SPLUNK_HOME/bin/splunk start splunkd

to disable splunkweb:

$ $SPLUNK_HOME/bin/splunk disable webserver

launch splunk web

navigate to: http://mysplunkhost:8000

you can change the port number, but by default it is 8000. the first time you log in to splunk, the default login details are

Username - admin
Password - changeme

splunk will prompt you to change them to something else
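on the port number mentioned above: the url is just the host plus the web port, so a tiny helper can build it for whatever port you end up using. splunk_web_url is a made-up name for this sketch, and the set web-port command in the comments is how i understand the cli to change the port, so check it against your version.

```shell
# build the splunk web url from a host and an optional port (8000 if omitted).
# splunk_web_url is a hypothetical helper for this post.
splunk_web_url() {
    host="$1"
    port="${2:-8000}"
    echo "http://$host:$port"
}

splunk_web_url mysplunkhost
splunk_web_url mysplunkhost 9000

# to move splunk web to another port (assumption: a restart is needed after):
#   $SPLUNK_HOME/bin/splunk set web-port 9000
#   $SPLUNK_HOME/bin/splunk restart
```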

get splunk to start up whenever you reboot

as root, run

$ $SPLUNK_HOME/bin/splunk enable boot-start

if you don’t start splunk as root, you can pass in the -user parameter to specify which user to start splunk as. for example, if splunk runs as the user bob, then as root you would run

$ $SPLUNK_HOME/bin/splunk enable boot-start -user bob

if you want to stop splunk from running at system startup time, run

$ $SPLUNK_HOME/bin/splunk disable boot-start

change the default splunk server name

the splunk server name setting controls both the name displayed within splunk web and the name sent to other splunk servers in a distributed setting.

the default name is taken from either the DNS name or the IP address of the splunk server host.

to change the server name via the CLI, use the set servername command.

$ $SPLUNK_HOME/bin/splunk set servername somename

changing the datastore location

the datastore is the top-level directory where the splunk server stores all indexed data. to change the datastore directory via the CLI, use the set datastore-dir command

$ mkdir /var/splunk
$ $SPLUNK_HOME/bin/splunk set datastore-dir /var/splunk/
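a directory created as root is not necessarily writable by the user splunk runs as, so it is worth checking before pointing the datastore at it. a rough sketch, where prepare_datastore is a made-up helper and the demo uses a throwaway directory instead of /var/splunk:

```shell
# create a datastore directory and confirm the current user can write to it.
# prepare_datastore is a hypothetical helper; run it as the user splunk runs as.
prepare_datastore() {
    dir="$1"
    mkdir -p "$dir" || return 1
    if [ -w "$dir" ]; then
        echo "ok: $dir is writable"
    else
        echo "not writable: $dir" >&2
        return 1
    fi
}

# demo against a throwaway directory rather than /var/splunk
demo_dir="$(mktemp -d)/splunk"
prepare_datastore "$demo_dir"
```

if the check passes for your real directory, carry on with the set datastore-dir command.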

set minimum free disk space

the minimum free disk space setting controls how low disk space in the datastore location can fall before splunk stops indexing. splunk resumes indexing when more space becomes available.

to change the minimum free space value via the CLI, use the set minfreemb command. the value is in megabytes.

$ $SPLUNK_HOME/bin/splunk set minfreemb 100
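to see how close you are to that limit, you can compare the free space on the datastore filesystem against the same number. a small sketch using plain df and awk. free_mb is a made-up helper, and the directory is set to / only so the example runs anywhere.

```shell
# print the free space, in whole megabytes, on the filesystem holding a path
free_mb() {
    # df -P prints one predictable line per filesystem, -k uses 1024-byte
    # blocks, and column 4 is the available space
    df -P -k "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

min_mb=100   # same value as the minfreemb example
dir=/        # swap in your datastore directory, e.g. /var/splunk

free="$(free_mb "$dir")"
if [ "$free" -lt "$min_mb" ]; then
    echo "only ${free}MB free on $dir, below ${min_mb}MB"
else
    echo "${free}MB free on $dir"
fi
```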

add data to the default index

in the GUI, select the large “Add Data” button on the right side of the screen.
choose some data source, for instance a file or directory. splunk will let you browse the filesystem on the host for files you’d like to ingest. the permissions of the user splunk runs as matter here, since it can only ingest files that user can read.
select a data source, for instance: /var/log/httpd/access_log.
choose how you’d like splunk to parse and display the data (the source type), so the events are human readable.
tell splunk you’d like to continuously index this data source by clicking that radio button, and choosing the index you’d like the data to go into. you can just go with the default index for simplicity.

and that’s it. to add more data, just repeat the steps.
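the same thing can be done from the command line with splunk add monitor. here is a sketch that checks the file is readable first, since that is the usual stumbling block. add_monitor is a made-up wrapper, and the readability test only means something if you run it as the user splunk runs as.

```shell
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"

# add a file to splunk's monitored inputs, but only if we can read it.
# add_monitor is a hypothetical wrapper around the splunk cli.
add_monitor() {
    src="$1"
    index="${2:-main}"   # main is splunk's default index
    if [ ! -r "$src" ]; then
        echo "cannot read $src, check permissions" >&2
        return 1
    fi
    "$SPLUNK_HOME/bin/splunk" add monitor "$src" -index "$index"
}

# usage:
#   add_monitor /var/log/httpd/access_log
```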

you can create your own index by clicking ‘settings -> indexes -> new’ and specifying some configuration settings for the index, but you don’t have to do this if you’re just trying it out.
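an index can also be created from the command line with splunk add index. a sketch with a small sanity check on the name. create_index is a made-up wrapper, and the naming rule it enforces (lowercase letters, digits, underscores and hyphens, starting with a letter or digit) is a deliberately conservative assumption rather than a quote from the splunk docs.

```shell
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"

# create an index from the cli after a basic check on the name.
# create_index is a hypothetical wrapper; the name rule is an assumption.
create_index() {
    name="$1"
    case "$name" in
        ''|[!a-z0-9]*|*[!a-z0-9_-]*)
            echo "refusing index name: '$name'" >&2
            return 1 ;;
    esac
    "$SPLUNK_HOME/bin/splunk" add index "$name"
}

# usage:
#   create_index myindexname
```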

delete data from an index

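splunk has to be stopped before you can clean an index, and the deleted events are gone for good. start splunk again afterwards.

$ $SPLUNK_HOME/bin/splunk stop
$ $SPLUNK_HOME/bin/splunk clean eventdata -index myindexname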
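since clean only works while splunk is stopped, a small wrapper can do the stop, clean and start in one go. clean_index is a made-up name for this sketch. it refuses to run without an index name, because clean eventdata without -index applies to every index.

```shell
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"

# wipe the events from one index: stop splunk, clean, start again.
# clean_index is a hypothetical wrapper; it insists on an index name so a
# missing argument can never turn into "clean everything".
clean_index() {
    index="$1"
    if [ -z "$index" ]; then
        echo "usage: clean_index <indexname>" >&2
        return 1
    fi
    "$SPLUNK_HOME/bin/splunk" stop &&
    "$SPLUNK_HOME/bin/splunk" clean eventdata -index "$index" &&
    "$SPLUNK_HOME/bin/splunk" start
}

# usage:
#   clean_index myindexname
```

splunk asks for confirmation before deleting anything, so expect a prompt in the middle.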