FailPoints FAQ

 

Welcome to FailPoints

FailPoints is under continuous development, with new features, fine tuning and bug fixes, so maintaining an FAQ is quite a challenge. For this reason, we provide a ‘Click for help’ option next to most sections in the application, and email/phone support is always available.

A few things to know

Date/time and time zone: By default, all agents are configured using UTC date/time. Agents must have the correct time zone set to prevent inaccurate reports. In the dashboard, dates and times are converted to the local time of the agent. This allows agents to be installed in different geographical locations while each report still shows the correct local date/time. Keep this in mind when viewing reports from agents in different time zones.
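
As a rough illustration of the conversion described above, the sketch below turns a UTC report timestamp into an agent's local time. The function name `to_agent_local` and the fixed offsets are assumptions made for this example and are not part of FailPoints. Fixed offsets keep the example self-contained; a named zone such as `zoneinfo.ZoneInfo("Europe/Berlin")` would also handle daylight saving changes where the time zone database is installed.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def to_agent_local(utc_timestamp: datetime, agent_tz: tzinfo) -> datetime:
    """Convert a UTC report timestamp to the agent's local time (hypothetical helper)."""
    if utc_timestamp.tzinfo is None:
        # Assumption: naive timestamps are UTC, since agents use UTC by default.
        utc_timestamp = utc_timestamp.replace(tzinfo=timezone.utc)
    return utc_timestamp.astimezone(agent_tz)


report_time = datetime(2024, 1, 15, 18, 30, tzinfo=timezone.utc)
west_coast = timezone(timedelta(hours=-8))    # illustrative fixed offset
central_europe = timezone(timedelta(hours=1))  # illustrative fixed offset

print(to_agent_local(report_time, west_coast))      # 2024-01-15 10:30:00-08:00
print(to_agent_local(report_time, central_europe))  # 2024-01-15 19:30:00+01:00
```

The same outage therefore appears at 10:30 on one agent's dashboard and at 19:30 on another's, which is why the agent's time zone setting matters when comparing reports.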

Software or hardware agent: We are often asked if the agent is available in any other format. The agent software runs on Windows 7/8/10 and some Linux flavors. The FailPoints service uses the hardware agents exclusively for several reasons. They are dedicated to specific duties, are automatically updated and run 24/7 without any user intervention. They use less than one amp of power and are inexpensive to replace if needed. They are also easier to support than full operating systems with a varying mix of hardware and software installed. This makes the hardware agent the best option for the FailPoints service where critical services are being monitored.

Outage classification: FailPoints has three different classifications in outage reports. The algorithm tries to determine where Internet connectivity problems occur, then shows this information in each agent's reports. Support personnel should be aware that algorithms can make mistakes depending on the network configuration, especially when multiple levels of private IPs are being used between the customer and the provider and perhaps even the provider's upstream. How the classification algorithm functions is explained further below.

The more agents installed in any one neighborhood, business park, town or city, the better the statistics become, because reports from multiple affected locations can be correlated.

Feedback
Please never assume that someone else has reported or will report that something in FailPoints needs attention, as this is usually not the case. Most people do not report problems, and no matter how much testing is done before releasing something, some problems are only learned about through complaints. No matter the issue, please take a moment to send feedback so that any problems can be taken care of. With input, problems are usually fixed very quickly. Feedback, good or bad, is very welcome.

IMPORTANT: Cable based Internet services

Cable provider based Internet services have an additional layer of potential problems because they depend on both signal levels and TCP/IP routing. FailPoints cannot monitor cable plant signal levels; however, it can give hints.

Agents will always report IP network problems but cannot know about signal level issues which may be affecting the local modem or the provider's equipment outside and in the area.

One example is seeing inactive notifications and outage-triggered speed test results without any related outages in the reports. This typically happens because the agent was unable to communicate or was disconnected in some way while there were no actual IP outages.

If restarting the modem gets everything back online and there are no outages in the FailPoints reports, it may mean the modem or something upstream is experiencing problems at the cable plant level. In such cases, logging into the local modem to check signal levels and history could help confirm this.

How FailPoints helps

FailPoints provides details about IP/Internet connectivity which can be used by IT personnel that need to know what is happening at remote locations. Too often, tools and information offered by the Internet provider may not be the most accurate because they are monitoring their overall network to ensure that everything is up and may not see issues that the local service is experiencing.

Using bandwidth test sites gives some idea of bandwidth, but even when the results are good, connectivity could continue to be slow or sluggish and outages may be occurring.

The only sure way of knowing how services are performing is by conducting non stop, 24/7 testing from each location. FailPoints was created specifically to offer a low cost, low resource usage capability of monitoring services during and especially after problems have occurred. Having FailPoints historical data can be extremely valuable when looking for problems that have already come and gone.

As there exists a large number of services on the Internet which report outages and downed sites and services, the FailPoints service is meant to help network managers closer to the end points. FailPoints is a service that can help organizations that maintain remote customers or any number of their own remote locations.

FailPoints is a responsive and reactive service, sending alerts about potential network problems, environment conditions and unauthorized firewall ports so that issues can be acknowledged as quickly as possible to prevent down time. In many cases, time and money can be saved by preventing truck rolls to remote locations by looking at the overview map and/or agent dashboards.

Some examples:

-Alerts immediately notify the correct people about problems. All alerts must be acknowledged before they can be dismissed. In addition, alerts can only be closed with a note about what happened and what the remedy was. This ensures that all issues are dealt with and a historical page of alerts helps to see any patterns.

-The overview map suddenly shows multiple inactive agents in the same area. This could be indicative of an ISP experiencing problems affecting these locations. The overview map is a visual representation that clearly shows related problems. In this case, the problem is likely not any one location but the cause will be shown once the agents are back online and send their reports.

-Each location's reports can help determine patterns and problems before sending any personnel.

-Those at the remote location can be told that they had an Internet problem and that it had nothing to do with supported services or resources.
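
The alert handling described in the first example (acknowledge first, then close with a note and a remedy) can be pictured as a small state machine. The sketch below is only an illustration of that rule; the `Alert` class, its field names and its state names are assumptions and do not describe how FailPoints is implemented.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical alert record: open -> acknowledged -> closed."""
    message: str
    state: str = "open"
    acknowledged_by: str = ""
    note: str = ""
    remedy: str = ""

    def acknowledge(self, user: str) -> None:
        if self.state != "open":
            raise ValueError("only open alerts can be acknowledged")
        self.acknowledged_by = user
        self.state = "acknowledged"

    def close(self, note: str, remedy: str) -> None:
        # An alert cannot be dismissed until someone has acknowledged it.
        if self.state != "acknowledged":
            raise ValueError("alert must be acknowledged before it can be closed")
        # Closing requires a record of what happened and how it was fixed.
        if not note.strip() or not remedy.strip():
            raise ValueError("a note and a remedy are required to close an alert")
        self.note, self.remedy = note, remedy
        self.state = "closed"


alert = Alert("Agent inactive")
alert.acknowledge("tech1")
alert.close("Modem lost power", "Replaced power adapter")
print(alert.state)  # closed
```

Requiring the note at close time is what makes the historical alerts page useful for spotting patterns later.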

There are countless ways to save money that would otherwise be spent looking for problems that came and went or no longer exist, but the main benefit is knowing about problems quickly enough to respond before complaints start coming in.

How FailPoints works 

Most problems with Internet connectivity are with the ISP or its upstream provider. In some countries, there can be any number of smaller ISPs before reaching a significant higher tier, and any one of those could be experiencing problems. FailPoints does not focus on monitoring the Internet as a whole but on more relevant data gathered close to each location, which can help find problems with the LAN and especially with the provider and perhaps its upstreams. The information generated by FailPoints can be highly useful not only to organizations but to the providers as well.

The primary objectives are to create a near real-time, low impact method of:

  • Ensuring that remote Internet/IP connectivity and communications are always up.
  • Having the ability to monitor any number of remote locations, consolidating all data into one management environment.
  • Alerting the proper personnel as quickly as possible to potential network and/or security problems.
  • Providing secure access to remote equipment rooms and networks without opening firewall ports.
  • Providing backup DNS should upstream services become unavailable or for privacy reasons.
  • Providing simple remote camera usage which could help when planning to send personnel.
  • Adding optional multi-metric environment sensors to any location.

All packed into a tiny inexpensive device that is on the job within minutes, easily DHCP connected and using very little power. Nothing else to install.

The concept is easy to understand, the effectiveness is instantly seen and the technology behind FailPoints is constantly evolving. 

Agent/Network communications

The agent monitoring software is one side of the solution while the FailPoints network is the second.
A third part is a combination of both the agent and the network side.
Agents can have up to three separate ways of communicating with the network.
The network can send requests to any number of agents to generate special and custom requests and reports.

If speed testing is enabled, the agent software tries to pick the best time to run a full saturation speed test. The agent monitoring software contains an algorithm which tries to pick up where it left off. When the agent is restarted, it downloads the last averages that it stored in order to have a starting point. From here, the algorithm re-adjusts as data begins accumulating so that it can rebuild its averages.

Averages are built through a baseline speed test if speed testing is enabled, overall ping times and a short non saturation speed test that is occasionally run. As data builds up, the algorithm will recognize if there are any fairly large percentage differences between the averages and the current results which triggers a full saturation test.
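
To make the idea of "fairly large percentage differences" concrete, here is a minimal sketch of a running average and a percentage-based trigger. It is not the FailPoints algorithm: the class and function names, the incremental-mean update and the 30% threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunningAverage:
    """Incrementally updated mean, seeded from the last stored average."""
    value: float
    count: int = 1

    def update(self, sample: float) -> None:
        self.count += 1
        self.value += (sample - self.value) / self.count


def should_run_full_test(average: float, current: float,
                         threshold_pct: float = 30.0) -> bool:
    """True when `current` differs from `average` by more than threshold_pct (assumed value)."""
    if average <= 0:
        return False
    deviation_pct = abs(current - average) / average * 100.0
    return deviation_pct > threshold_pct


avg = RunningAverage(value=40.0)  # hypothetical stored ping average in ms
for sample in (42.0, 38.0, 41.0):
    avg.update(sample)

print(round(avg.value, 2))                    # 40.25
print(should_run_full_test(avg.value, 44.0))  # False
print(should_run_full_test(avg.value, 95.0))  # True
```

Seeding the average from a stored value mirrors the description of an agent picking up where it left off after a restart, then re-adjusting as new data accumulates.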

FailPoints components

The FailPoints service is composed of several components starting with the overview map.

The overview is not only useful in identifying multi location alerts in a visual manner but also all locations which still need attention. Different colors are used to identify reasons for alerts which remain on the map until acknowledged and dealt with in one way or another. The overview not only helps to see when whole areas are experiencing troubles but also to make sure that there is accountability. The overview also shows recent events for all agents/locations and some basic statistics.

The next major component is agent dashboards. Each agent generates data for its own location and dashboard. Reports include heartbeat, alerts, recent events, outages, average times of outages, pings, speed testing and historical trends.

The ‘At a glance’ section shows details about the agent itself, sensor data if equipped, network settings and services which may be enabled or disabled.
All dashboard sections have a ‘Click for help’ option to get help. Many sections are also covered in this document.

Most settings can be configured in the Configure menu while some can be controlled from the dashboard. External links to remote networks, cameras and DDNS are in the main dashboard for quick access.

Org manager and users

FailPoints is a role based access level service where different accounts can have different levels of access. When an organization starts using FailPoints at least one account becomes the organization manager or org manager and has full control of any options. The manager can also create additional managers and users.

The next level is ‘users’, which are typically personnel that physically visit and maintain various locations where agents are installed. Users cannot override options set by the org manager.

All managers and users have access to the overview map and can access any and all agents owned by the organization. This is to ensure that any user can take care of issues without access level hindrances.

When a user installs and activates an agent, it becomes available to all users of the organization and the org manager can enable or disable certain options once the agent has become active.

The FailPoints service can only be accessed once an organization has been created by a sales or support person with Echo Networks. Direct sign up to the service is not possible.

Note: It is important to confirm that emails are being received so that alerts will make it to the correct personnel.

FailPoints Control Panel (FCP)

The FCP is the starting point for all FailPoints services and features. This is where the overview map, agents list and agent dashboards can be reached.

Overview map/status

Org managers can view the manage agents list to see the status of all agents and easily find locations needing attention. However, the overview can be considered the visual starting point of the organization as it can be easily monitored by either technical or non technical personnel.

A legend shows the alert types and the colors associated with each. When one or more locations go into an alert state, each is shown on the map. Colored markers help to clearly show if one or multiple locations are showing alerts. Clicking on any marker brings up the details of that agent, and clicking on its link brings you to that agent's dashboard.

A Recent events section on the right hand side shows a list of the most recent events which have occurred anywhere in the networks being monitored. Users can zoom in and out like any other map application to see locations experiencing problems anywhere in the world.

Note that only locations with alerts are shown in the map to prevent congesting the map with properly functioning agents/locations.

Settings menu – Manager only

The settings menu is where organization managers will find configuration options which only managers can change. As new features are added, any controllable options which belong in this section will be added.

– Users – Manager only

The Manager users menu shows which users have been created in this organization along with their role. Only managers may add/remove/edit users.

Mousing over the right hand icons will show what each option does. The manager is able to edit accounts, assign roles, disable and delete accounts.

To add an additional manager or user, click on Create user then fill in the required fields. Using strong passwords is always advised and making sure that the activation email address is correct ensures that the new manager/user being added will be able to activate their new account. 

Agents menu

The agents menu shows all agents in a list format. Because an organization could have hundreds or thousands of agents, filtering is provided so that managers and users can more easily find and view specific agents and locations.

For example, filtering to show only inactive or disabled agents or filtering for a specific street in a specific city will display only those agents which meet the desired criteria. 
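
Conceptually, this kind of filtering keeps only the agents that match every chosen criterion. The sketch below shows the idea on an in-memory list; the `Agent` fields and the `filter_agents` helper are hypothetical and do not reflect the actual FailPoints data model.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical agent record with a few filterable fields."""
    nickname: str
    status: str
    street: str
    city: str


def filter_agents(agents, **criteria):
    """Return agents whose fields equal every given criterion (case-insensitive)."""
    def matches(agent):
        return all(
            str(getattr(agent, name)).lower() == str(wanted).lower()
            for name, wanted in criteria.items()
        )
    return [agent for agent in agents if matches(agent)]


agents = [
    Agent("Front office", "active", "Main St", "Springfield"),
    Agent("Warehouse", "inactive", "Dock Rd", "Springfield"),
    Agent("Branch 2", "active", "Main St", "Shelbyville"),
]

print([a.nickname for a in filter_agents(agents, status="inactive")])
# ['Warehouse']
print([a.nickname for a in filter_agents(agents, street="main st", city="springfield")])
# ['Front office']
```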

Right hand Actions
Reset filters: Filtering can be reset to default.
Order hardware agent: Leads to a form for ordering more agents and other options.
Activate hardware agent: Managers and users can activate new agents using this option.

Right hand Actions (mouse over)
Dashboard: Takes the user to this agent's dashboard.
Notifications: Takes the user directly to this agent's Notifications menu/options.
Reboot: Some agents have a remote reboot option. Use this to reboot any remote agent. 

Dashboard

The dashboard view shows this agent's current reports and information, and contains sub menus for configuring various functions.

– Alerts

Users can configure alerts based on various conditions such as an agent no longer communicating, unauthorized firewall ports opening, environment condition limits being reached and so on. Email and SMS settings can be controlled through the notifications menu item for each agent.

Note that emails and SMS are limited to ensure that FailPoints does not spam its members. It is assumed that the user wants to know about issues as quickly as possible and would therefore check after even one notification. The user must reset notifications every so often, depending on which are being sent.

Also note that SMS is available only in 1+ dialing areas. If international SMS is required, please contact FailPoints support for help and information.

– Heartbeat

The heartbeat is a near real time display of the agent communicating with the FailPoints network. The heartbeat helps confirm that the agent is communicating properly and gives a visual cue that everything is running as it should. In normal situations, agents automatically restart every twenty four hours to ensure no memory or other operating system problems have occurred. The last restart date/time is shown below the graph along with how long the agent has been running for.

The heartbeat is generated by three separate metrics of communications with the FailPoints network.

– Recent events

Recent events are outages, updates and other communications between the agent and the FailPoints network. Event notices help to see and understand what is going on with agent/network communications and if actions should be taken. The Recent Events section also contains suggested actions such as notifications needing to be reset and so on.

– Outages

The outages graph shows the last 50 outages the connection has experienced along with detailed information about each. When mousing over each bar, accumulated details for that particular outage will be displayed. Older information can be found in the historical menu.

– Outages avg time

As outages build up so does the averages graph. This graph shows when most of the outages are occurring so that over time, trends are built up showing when problems are happening. Older information can be found in the historical menu.
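
One simple way to picture such a trend is to count outages by the hour of day in which they started. The sketch below does exactly that; the function name and the sample timestamps are invented for illustration, and the real graph may be built differently.

```python
from collections import Counter
from datetime import datetime


def outages_by_hour(outage_starts):
    """Count outages per hour of day (0-23) from a list of start times."""
    counts = Counter(ts.hour for ts in outage_starts)
    return [counts.get(hour, 0) for hour in range(24)]


# Invented sample data: two outages around 02:00 and one in the afternoon.
starts = [
    datetime(2024, 3, 1, 2, 15),
    datetime(2024, 3, 2, 2, 40),
    datetime(2024, 3, 2, 14, 5),
]

histogram = outages_by_hour(starts)
peak_hour = max(range(24), key=histogram.__getitem__)
print(histogram[2], histogram[14])  # 2 1
print(peak_hour)                    # 2
```

A cluster at the same hour across many days can point to scheduled maintenance or a recurring load problem at the provider.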

– Pings

Along with other tests, pings are used to establish patterns and averages. Pings are not directed at any nearby point; they are used only to generate averages so that the algorithm can do its job. Older information can be found in the historical menu.

– Speed test graph

Bandwidth testing is an interesting topic which is often misunderstood because it is not solely about bandwidth. A location can have a high bandwidth service yet users may find themselves barely able to reach resources on the Internet.

Commercial speed testing sites saturate the connection, but because bandwidth is shared with other customers in each area, this sort of testing is unlikely to show what is actually going on with the connection.

If large numbers of consumers were speed testing in the same area, Internet access for everyone in that area would slow to a crawl. Speed testing alone is not of much use without additional data. If the test is optimized to be as close as possible to your provider's edge network, you are in effect testing under the best conditions possible, which is not real world testing. Most speed testing destinations are optimized content delivery networks, known as CDNs.

Speed testing to get actual results in shared environments such as consumer grade Internet services is a very difficult problem to solve. Shared Internet bandwidth speeds can and do change constantly, within seconds. A speed test result reflects only the duration of that test, and a slowdown may have already passed by the time the test runs.

If speed testing is enabled, the agent algorithm will trigger speed testing based on a variety of fluctuations, trying to test at the best possible moment. This will help to better visualize how speeds (bandwidth) and in fact, throughput are doing on the connection in a way that a human being trying to test at the right moment could not do.

The FailPoints solution tries to show ongoing averages (baseline) and when speeds become lower. The result is a graph which gives a visual representation of how speeds are doing and which tests were conducted. Mousing over the graph will display dates/times and types of tests. Different tests are shown in different colors to help visualize the overall report to more easily compare with outages and pings reports.

Colors and meanings
Various colors help to visualize which tests are baseline and which are triggered based on certain events.

Green – Baseline test. The agent software is running a speed test on a regular basis in order to establish a baseline or average.

Blue – Latency trigger. This test is triggered when the latency of the connection begins to fluctuate outside of the measured averages.
Orange – Slowdown trigger. This test is triggered when short burst speed tests are run and the results show slower than usual speeds.
Black – Outage trigger. This test is run moments after an outage ends to determine if speed is back to normal or if it remained slower than the calculated average before the outage.
Note: Cable modem signal level issues could also trigger this test erroneously.

– IMPORTANT: Speed testing uses data

This feature is experimental; the algorithm is still in development.

Speed vs throughput: Internet ‘speed’ is technically bandwidth. Bandwidth is the max amount of data the connection will allow based on the purchased plan.

Throughput is the amount of data this connection can actually move at any given time. Bandwidth and throughput are very different things.

Monthly data plan vs Unlimited plans: Speed testing uses data. If the data plan is large or unlimited, this may not be an issue but if it is a capped data plan, speed testing should be conducted conservatively. FailPoints understands capped plans and tries to optimize this test to make it useful without wasting data.
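
The arithmetic behind both points is straightforward. The sketch below computes measured throughput from bytes moved over time, and estimates how much data saturation tests could consume in a month. The function names and the sample figures (a 100 Mbps link, 10 second tests, 6 tests per day) are assumptions for illustration, not FailPoints defaults.

```python
def throughput_mbps(bytes_transferred: int, seconds: float) -> float:
    """Measured throughput in megabits per second."""
    return bytes_transferred * 8 / seconds / 1_000_000


def speed_test_data_mb(speed_mbps: float, duration_s: float) -> float:
    """Approximate megabytes used by one saturation test at the given speed."""
    return speed_mbps * duration_s / 8


def monthly_usage_gb(speed_mbps: float, duration_s: float,
                     tests_per_day: int, days: int = 30) -> float:
    """Approximate gigabytes used per month by repeated speed tests."""
    return speed_test_data_mb(speed_mbps, duration_s) * tests_per_day * days / 1000


print(throughput_mbps(25_000_000, 4.0))  # 50.0
print(speed_test_data_mb(100, 10))       # 125.0
print(monthly_usage_gb(100, 10, 6))      # 22.5
```

In the first line, a connection sold as 100 Mbps that moves 25 MB in four seconds is delivering 50 Mbps of throughput. The last line shows why frequent saturation tests matter on a capped plan.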

Disable (default setting)
No speed testing will be done by the agent.

Allow
The agent will run speed tests to determine if bandwidth has fallen below a certain threshold. An algorithm controls this function based on a variety of conditions such as latency and slowdowns. The latter is a short, non saturation based speed test which triggers a full saturation speed test if the results are poor.

Speed Limit (Internal and experimental)
A speed limit may be imposed to conserve bandwidth. The test is trying to determine when bandwidth drops considerably, not what the full speed is. The agent's job is to try to report when speeds fall below average or even below usefulness, which is difficult because it cannot know when something else is actually using bandwidth, such as watching movies or downloading files. The speed limit is still in development.

– Network stats

This section offers overall statistics about the performance of this Internet service and where most of the problems might be. MOD means ‘most often down’.

% Affected networks – Percentage of Internet problems with LAN, ISP or beyond.
Top MOD hops – Top most problematic hops showing where, LAN, ISP or beyond.
Top MOD orgs – Organizations experiencing the most problems relating to this connection.

Important – All references to ‘Beyond ISP’ are informational only. The most important information is how the ISP is performing. Anything beyond ISP is not only informational but is a test point that FailPoints is using to monitor the performance of the service. In some cases, some of these could have affected services but the main point is to monitor the Internet service provider. Older information can be found in the historical menu.

MOD means ‘Most Often Down’ and, in this case, relates to the hop and organization. A hop is a piece of networking hardware, such as a router, a modem or one of the provider's switches, that packets must travel across in order to reach Internet sites or services. Each device that data travels across is called a hop.

If any one of these hops prevents data from getting to the next device, the local connection could suffer slow, sluggish or even unreachable services until that device is fixed. In most cases, the cause of such a loss can be attributed to a bad cable, hardware malfunctioning or improperly configured interface/device or of course human error such as a cable being disconnected.

In today’s real time world, such problems can affect VoIP phone calls, live video and other services not to mention constantly getting disconnected from servers and other devices.

The Network stats section shows the last 50 outages broken down by LAN, ISP (Internet provider) and Beyond. The top 5 hops show where most of the hop problems have been, and the top 5 orgs show which organizations were involved when the problems are beyond the local network.
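
A breakdown like this can be thought of as a few counts over the most recent outage records. The sketch below is an assumption-laden illustration: the record fields (`network`, `hop`, `org`), the sample addresses (taken from documentation-reserved ranges) and the organization names are all invented and are not taken from FailPoints.

```python
from collections import Counter


def network_stats(outages, window=50, top_n=5):
    """Summarize the most recent outages by network, hop and organization."""
    recent = outages[-window:]
    total = len(recent)
    by_network = Counter(o["network"] for o in recent)
    pct_affected = {net: round(100 * n / total, 1) for net, n in by_network.items()}
    top_hops = Counter(o["hop"] for o in recent).most_common(top_n)
    # Organizations only matter when the problem is beyond the local network.
    top_orgs = Counter(
        o["org"] for o in recent if o["network"] != "LAN"
    ).most_common(top_n)
    return pct_affected, top_hops, top_orgs


outages = [
    {"network": "ISP", "hop": "198.51.100.1", "org": "ExampleNet"},
    {"network": "ISP", "hop": "198.51.100.1", "org": "ExampleNet"},
    {"network": "LAN", "hop": "192.168.1.1", "org": "Local"},
    {"network": "Beyond", "hop": "203.0.113.9", "org": "ExampleTransit"},
]

pct, hops, orgs = network_stats(outages)
print(pct)      # {'ISP': 50.0, 'LAN': 25.0, 'Beyond': 25.0}
print(hops[0])  # ('198.51.100.1', 2)
print(orgs[0])  # ('ExampleNet', 2)
```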

– Right hand columns

At a glance, Right hand columns and information

Sensors
If this agent has the optional environment sensors enabled, the first section will show all of the metrics being monitored at this location. The user can monitor conditions where the agent and sensor are installed and can also receive alerts if certain limits have been reached. For example, if temperature or humidity reaches a certain limit, an alert can be sent. Alerts for a number of sensors can be set in the Configure, Sensors menu.

After setting alerts, the user must ensure that notifications are enabled or reset and that valid email/SMS details are provided. FailPoints must be contacted for International SMS.
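
The limit checking described for sensors amounts to comparing each reading against a low and a high bound. The following sketch shows that comparison; the metric names, the limit values and the `check_limits` helper are hypothetical examples rather than actual FailPoints settings.

```python
def check_limits(readings, limits):
    """Return a message for every reading outside its (low, high) limits."""
    alerts = []
    for metric, value in readings.items():
        low, high = limits.get(metric, (None, None))
        if low is not None and value < low:
            alerts.append(f"{metric} below limit: {value} < {low}")
        elif high is not None and value > high:
            alerts.append(f"{metric} above limit: {value} > {high}")
    return alerts


# Hypothetical limits for an equipment room.
limits = {"temperature_c": (5.0, 35.0), "humidity_pct": (20.0, 80.0)}
readings = {"temperature_c": 41.0, "humidity_pct": 45.0}

print(check_limits(readings, limits))
# ['temperature_c above limit: 41.0 > 35.0']
```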

Links
If the agent is configured with DDNS, camera or remote access, for example, convenient links are provided in the dashboard for users to click on without having to memorize URLs, IPs and other details.

Remote controls
If this agent is configured to allow camera or remote access for example, details about each of those settings are conveniently shown to the user. An org manager can override any of these settings to enable, disable and revoke services.

At a glance
Brief view of agent, network and other settings.

ID – Unique agent identifier.
Nickname – Nickname to identify this agent.
Device – Shows operating system or hardware/OS type.
OTM ver – FailPoints OTM version.
Time Zone – Time zone that agent is in. Each agent can have its own time zone.
Activated On – Shows when agent was activated.
Comms – Status of this agent, allowed or denied communications with FailPoints.

Network
My Public IP – The public IP of this site.
LAN IP – The IP for this device.
DNS Used (first) – The DNS server being used for this agent.
DNS Used (second) – The secondary if supplied.

Options
The org manager has full control over allowed options. Controllable options are shown with a link on the right hand side. Once set, users will see Available or Not available along with disabled. Users cannot override these settings and can see which options are available for this agent and location. 

Controllable options include:
Camera
DDNS
Notifications
RAS
Security scan
Sensors
Speed test

Non controllable options include:
Data usage
DNS server

Historical menu

The historical menu is where all long term data can be found. 

– Alerts

Historical details about all alerts for this agent are stored long term.
Current alerts are shown at the top of the page while historical ones which have been closed are below.

All alerts can be filtered to find specific events by kind of alert, date/time, who acknowledged them, when, and so on.

This information can be very useful when looking for trends or specific events and therefore cannot be deleted by managers or users.

– Hops

Hops is in alpha development in order to determine the most efficient and useful way of showing this data.

The agent not only keeps track of hops internally but it regularly sends an updated hops list which is used by the classification algorithm to help determine LAN, ISP and beyond ISP information. 

FailPoints works by source and destination testing and does contain some multi-point testing algorithms. The hops are mainly used to establish a relatively constant destination point which the agent can communicate with.

As mentioned in ‘How FailPoints works’, most problems with Internet connectivity are with the ISP or its upstream provider. In some countries, there can be any number of smaller ISPs before reaching a significant higher tier, and any one of those could be experiencing problems.

FailPoints does not focus on monitoring the Internet as a whole but on more relevant data gathered close to each location, which can help find problems with the LAN and especially with the provider and perhaps its upstreams. The information generated by FailPoints can be highly useful not only to organizations but to the providers as well.

At this time, the most current hops list is shown in text and is meant to be informative.

– Outages/Averages

The dashboard shows a certain amount of data in order to consolidate the most recent events in one quick view. Mousing over events gives all of the details available such as how long, where and with whom the event was along with hops and so on. 

In the historical outages and average times of outages feature, users can select a certain date range which gives an overall view of outages over that period of time. When selecting a range such as one week, one month and so on, all outages for that period will be shown along with average outages times for the same period of time.

– Pings

The dashboard shows four hours of pings, both to keep it from being congested with too much data and because the most recent pings are the most useful. Mousing over events gives all of the details available such as date/time and the max, avg and min ping times.

Note that each record is an average of one minute. The agent runs regular pings within a one minute period which it then consolidates into one averaged out record. It then sends this to the dashboard to give a quick visual of how things are going. At the same time, the agent is also averaging out all pings which the algorithm uses to determine if there is latency occurring. 
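
The consolidation step described above can be sketched as collapsing one minute of ping times into a single record holding the minimum, average and maximum. The function name and record layout below are assumptions for illustration only.

```python
from statistics import mean


def consolidate_minute(ping_ms):
    """Collapse one minute of ping times (ms) into a single min/avg/max record."""
    if not ping_ms:
        return None  # no successful pings during this minute
    return {
        "min": min(ping_ms),
        "avg": round(mean(ping_ms), 1),
        "max": max(ping_ms),
    }


# Invented sample: four pings taken within the same minute.
print(consolidate_minute([20.0, 22.0, 24.0, 30.0]))
# {'min': 20.0, 'avg': 24.0, 'max': 30.0}
```

Keeping the minimum and maximum alongside the average preserves short latency spikes that a plain average would hide.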

In the historical pings feature, users can select a certain date range which gives an overall view of pings over that period of time. When selecting a range such as one week, one month and so on, all pings for that period will be shown.

Note that unlike outages and other historical data, older pings data is removed from storage regularly to prevent tables from growing with relatively non useful data. 

– Speed tests

The dashboard shows the last 50 speed tests in order to consolidate the most recent events in one quick view. Mousing over events gives all of the details available about the speed test and its results.  

In the historical speed tests menu, users can select a certain date range which gives an overall view of speed test results over that period of time. When selecting a range such as one week, one month and so on, all results for that period will be shown. 

– Stats

The dashboard shows the last 50 stats while the historical menu offers longer ranges of details. The stats are meant to offer overall statistics about the performance of this Internet service and where most of the problems might be. MOD means ‘most often down’. Unlike the dashboard, the historical menu stats offer

% Affected networks – Percentage of Internet problems with LAN, ISP or beyond.
Top MOD hops – Top most problematic hops showing where, LAN, ISP or beyond.
Top MOD orgs – Organizations experiencing the most problems relating to this connection.

Important – All references to ‘Beyond ISP’ are informational only. The most important information is how the ISP is performing. Anything beyond ISP is not only informational but is a test point that FailPoints is using to monitor the performance of the service. In some cases, some of these could have affected services but the main point is to monitor the Internet service provider. Older information can be found in the historical menu.

MOD means ‘Most Often Down’ and in this case refers to hops and organizations. A hop is a piece of networking hardware, such as a router, a modem or one of the provider’s switches, which packets must travel across in order to reach Internet sites or services. Each device that data travels across is called a hop.

If any one of these hops prevents data from getting to the next device, the local connection could suffer slow, sluggish or even unreachable services until that device is fixed. In most cases, the cause of such a loss can be attributed to a bad cable, malfunctioning hardware, an improperly configured interface or device, or human error such as a cable being disconnected.

In today’s real time world, such problems can affect VoIP phone calls, live video and other services, not to mention causing constant disconnections from servers and other devices.

The Network stats show the last 50 outages broken down by LAN, ISP (Internet provider) and Beyond. The top 5 hops show where most of the hop problems have been, and the top 5 orgs show which organizations were involved when the problems are beyond the local network.
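The stats described above can be sketched in a few lines of Python. The record layout (network, hop IP, organization) and the sample values are hypothetical and only illustrate how the percentages and top 5 lists could be derived from the last 50 outages; this is not the FailPoints implementation.

```python
from collections import Counter

# Hypothetical outage records: (network, hop_ip, organization).
outages = [
    ("LAN", "192.168.1.1", "Local network"),
    ("ISP", "10.20.0.1", "Example ISP"),
    ("ISP", "10.20.0.1", "Example ISP"),
    ("Beyond", "198.51.100.7", "Example Transit"),
]

def network_stats(records, last=50, top=5):
    """Return (% affected networks, top MOD hops, top MOD orgs)."""
    recent = records[-last:]          # only the most recent outages
    total = len(recent)
    by_network = Counter(net for net, _, _ in recent)
    percent = {net: round(100.0 * n / total, 1) for net, n in by_network.items()}
    top_hops = Counter(hop for _, hop, _ in recent).most_common(top)
    # Organizations only matter for problems beyond the local network.
    top_orgs = Counter(org for net, _, org in recent if net != "LAN").most_common(top)
    return percent, top_hops, top_orgs

percent, top_hops, top_orgs = network_stats(outages)
```

With the sample data, half of the outages are attributed to the ISP and the hop 10.20.0.1 is the most often down.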

Configure (menu)

The Configure menu is where most settings and configuration changes for the agent are made.

– Geo location

Allows changing the location and most importantly the time zone of the agent. Enter the full or partial address then click on ‘Auto Fill Address Details Below’ to ensure the correct location.

Note that while street numbers can be entered to set the map location correctly, this information will never be shown in public. Street addresses are used solely for users to filter agents in various locations.

– Webcam option

Hardware agents ordered from FailPoints come with special features which can be enabled through the dashboard. One of those features is the ability to connect a compatible (UVC) webcam to the USB connector on the device.

IMPORTANT: Note that the hardware agent needs at least one amp of power. When adding a USB device, the combined power requirement of the device and the agent must be taken into consideration, otherwise the agent will not function correctly or could even be damaged.

To size the power supply, take the current (amps) the camera needs, converting from watts if necessary, then add one amp for the agent. Typically, a 2-3 amp USB power supply should be sufficient.
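The arithmetic can be written out as a short Python sketch. It assumes the nominal 5 volt USB supply when converting watts to amps, and the function name is made up for this example; always check the camera’s own specifications.

```python
AGENT_AMPS = 1.0  # from the text above: the agent needs at least one amp

def required_supply_amps(camera_watts=None, camera_amps=None, volts=5.0):
    """Minimum power supply rating: camera current plus one amp for the agent.

    volts=5.0 assumes a standard USB supply (an assumption of this sketch).
    """
    if camera_amps is None:
        if camera_watts is None:
            raise ValueError("give camera_watts or camera_amps")
        camera_amps = camera_watts / volts  # amps = watts / volts
    return camera_amps + AGENT_AMPS

# A camera rated at 2.5 W draws 0.5 A at 5 V, so 1.5 A is the minimum.
minimum = required_supply_amps(camera_watts=2.5)
```

Choosing a supply comfortably above the computed minimum, such as the 2-3 amp range mentioned above, leaves some headroom.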

Camera settings available include the following values.

Enable Camera: Enable or disable the camera service on the hardware device.
The dashboard will show if the device supports this.

Resolution: (Default 1024×768) Pick the most efficient resolution for the camera in pixels. Note, higher resolution means more bandwidth required.
Options are: 160×120 320×240 800×600 1024×768 1280×720 1280×960 1600×1200 1920×1080

Framerate: (Default 5) Pick the framerate that the camera should stream at. Note, higher frame rates mean more bandwidth required.
Options are: 5 10 15 20 25 30

Username: (Default name) Create a secure user name for access. A combination of upper case, lower case and special characters is best.
Password: (Default pass) Create a password of at least 8 characters. A combination of upper case, lower case and special characters is best.

When clicking on Save, the agent may be rebooted in order to start the webcam service. Check the dashboard and the Heartbeat section to confirm when the agent is communicating again. Once it is confirmed to be communicating, connect the webcam to the USB port of the agent.

Determine the local IP of the agent by looking at the dashboard then browse to that IP using port 8080.
For example: http://192.168.1.34:8080/
Note that Internet Explorer does not seem to work.

A prompt for username and password should be shown. If not, try disconnecting then reconnecting the camera to have the drivers identify it. The agent may need to be fully powered down then back on with or without the camera connected. After experimenting to find the right combination, the camera should always be seen once it is working with the agent.

To use the webcam as a remote security camera over the Internet, port forwarding must be set up on the local router/firewall. Internet accessible devices should always use strong user name and password combinations.

Experiment with resolution and frame rates to determine what works best for the requirement and the available bandwidth.

– Remote Access Service (RAS)

The RAS feature is available on select FailPoints agents and gives logged in organization users remote access into the network where the agent is installed. This allows the user to securely gain access to routers, firewalls and other configuration pages and devices without having to open a port on the main router.

The connection is secure and encrypted, and is accessible from the agent’s dashboard in the Links section once configured. The Remote controls section of the dashboard shows the configuration set by the user.

Click on Configure, RAS to enable and configure the service.

LAN IP: Enter the LAN IP of the device to be reached from remote.

LAN port: Enter the port of the device to be reached from remote. 
Note that because multiple port forwards are involved, some experimentation may be required when dealing with port 443 devices. In some cases, using port 80 may be the only way to allow port forwarding, yet the connection remains https when using it.

Public IP: Enter the public IP which will be allowed to access the remote network. In addition to requiring an authorized user to be logged into FailPoints, configuring a public IP allows only that IP to access the remote network.
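Before saving, the three values can be sanity-checked. The Python sketch below uses the standard ipaddress module; the function name and the messages are assumptions for illustration, and the dashboard may apply different or additional checks.

```python
import ipaddress

def check_ras_fields(lan_ip, lan_port, public_ip):
    """Return a list of problems found in the RAS fields (empty if none)."""
    problems = []
    try:
        # The device to reach should sit on the private LAN.
        if not ipaddress.ip_address(lan_ip).is_private:
            problems.append("LAN IP is not a private address")
    except ValueError:
        problems.append("LAN IP is not a valid IP address")
    if not (isinstance(lan_port, int) and 1 <= lan_port <= 65535):
        problems.append("LAN port must be between 1 and 65535")
    try:
        # The allowed remote address should be publicly routable.
        if not ipaddress.ip_address(public_ip).is_global:
            problems.append("Public IP is not a publicly routable address")
    except ValueError:
        problems.append("Public IP is not a valid IP address")
    return problems

problems = check_ras_fields("192.168.1.1", 443, "8.8.8.8")
```

An empty list means the values look plausible; it does not confirm that the device is actually reachable on that port.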

After filling in all of the fields and clicking on Save, the process of establishing the secured connection will begin. The status is shown in the agent dashboard under Remote controls. Once everything is up and running, the Links section will show the RAS link and the Remote controls will show the RAS configuration.

The Remote controls section allows the user to close the link by clicking on RAS, Action allowed, Disable.

– Security scan

Enables a security scan by entering the allowed ports on the local firewall. If any other ports become open, an alert will be sent.
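The comparison behind such an alert can be pictured as a simple set difference. This Python sketch is an illustration only; how the scan discovers open ports and how alerts are delivered are not described here, and the function name is hypothetical.

```python
def unexpected_open_ports(allowed_ports, open_ports):
    """Ports found open that are not on the allowed list, in ascending order."""
    return sorted(set(open_ports) - set(allowed_ports))

# Ports 80 and 443 are allowed; a scan finds 22, 443 and 8080 open.
alerts = unexpected_open_ports([80, 443], [443, 8080, 22])
```

Any port returned would be a candidate for an alert.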

– Environment sensors

An optional sensor can be connected to the agent. One or multiple metrics can be configured to receive alerts. Please see the sensors section of the FAQ.

– Speed test

Allows the user to enable or disable speed testing.

– Settings

Nickname: User can set a nickname which is seen in the dashboard to help quickly identify this particular agent. 

Minimum Outage: This allows the user to set the minimum outage length which will be displayed in the dashboard. This option is for convenience in certain situations but most will want to see all outages no matter what length. Note that even if outages are not being shown in the reports, they are still added to the reports and accumulate. The agent always reports any and all outages, and ‘hiding’ them with the Minimum Outage setting may cause someone to overlook important issues.

Enable DDNS: Dynamic DNS, or DDNS, offers an easy to remember name for the agent. When DDNS is enabled, the generated URL will show in the dashboard in the Links section. Static IPs can cost around $10 to $15 per month, but DDNS is included for FailPoints members.

DDNS name: The default DDNS name is the agent ID number plus a domain name, for example 12955.domain.com. The user can set a custom name if it is available.

Enable data usage: Alpha code being tested to allow the agent to report on its bandwidth and data usage.

Details menu

The details menu displays information about the agent, some of which is also shown in the dashboard itself. The org manager may see items which are not available to users. These details are mainly used by FailPoints support.

Notifications menu

Please do not flag notifications as spam
Online services face a challenge. People sign up to see how things work then often flag their own notifications as spam rather than logging in to disable them. This behavior can cause others using the same mail service to experience problems. Please flag responsibly and disable notifications on the service if they are no longer wanted.

Notifications functionality
FailPoints email notifications cover both possible outages and inactive agent situations. This is done in order to limit the number of emails sent to the inbox. Since a member has taken the time to sign up, we assume that they want to know about events related to monitoring. Rather than sending constant emails, a fixed number are sent at a time so that a user can log into the dashboard for more detailed information.

Outages vs Inactive agent
Inactive notifications are meant to alert the user of any important event. If the event is an outage, the agent sends its report as soon as it can reach the Internet again. If the event doesn’t eventually show an outage, it means that the agent became disconnected from the Internet. The cause could be something that was turned off or disconnected, bad wiring, or any number of other things, including signal level problems if the connection is provided by a cable based ISP.

IMPORTANT: Never reboot an agent during an inactive notice because if there is an outage in progress, that report will be lost and not sent.

Manage menu

Once installed and activated, hardware agents immediately become part of the organization’s inventory. If an agent is damaged or is permanently removed from its monitoring location it will no longer send data or update its reports. However, its reports will not be removed until the agent is deleted from the inventory using the delete option.

Reset
If there is a need to reset an agent’s reports, this option clears all previous data and resets its reports as if it were freshly installed.

Delete
This option completely removes all reports for an agent. This is an irreversible function meaning that this agent will be removed from inventory along with all of its reports.

Re-using this hardware agent is possible but may require having to contact FailPoints support in order to have the agent prepared for a new activation or relocation. In rare cases, this may mean having to send the agent back to FailPoints.

How many agents

Any number of agents can be installed in an organization. Some may have dozens, some may have many thousands. Filtering using a variety of values is possible when dealing with very large numbers of agents. Users can filter by city/state, zip/postal code, streets and other values. If needed, multiple organizations could be created to more easily manage various locations and large numbers of agents.

Wrong date/time

By default, all agents are configured using UTC date/time. Agents must have the correct time zone set to prevent inaccurate reports. The date/time are converted to the local time of the agent in the dashboard. This allows installing agents in different geographical locations while having the ability to see the correct local date/time of the reports. This should be kept in mind when viewing agent reports which are in different time zones.
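The conversion from stored UTC time to the agent’s local time can be illustrated with Python’s standard datetime module. This sketch assumes a fixed UTC offset for simplicity, so it ignores daylight saving time; the function name and sample timestamp are made up, and the dashboard’s own conversion is not shown here.

```python
from datetime import datetime, timedelta, timezone

def to_agent_local(utc_stamp, utc_offset_hours):
    """Convert a UTC timestamp to local time at a fixed UTC offset."""
    if utc_stamp.tzinfo is None:
        # Treat naive timestamps as UTC, matching the agents' default.
        utc_stamp = utc_stamp.replace(tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return utc_stamp.astimezone(local_zone)

# A report stored at 18:30 UTC, viewed for an agent five hours behind UTC.
stored = datetime(2024, 1, 15, 18, 30, tzinfo=timezone.utc)
local = to_agent_local(stored, -5)
```

Named time zones with daylight saving rules can be handled with the standard zoneinfo module in place of the fixed offset.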

To set the correct time zone, click on Configure then Address and geo location. Enter part of or the full address then click on ‘Auto fill address details below’ and the system will pre-fill all fields along with a time zone. This approach ensures that agent locations are set consistently so that they can be properly filtered and more. Note that street addresses, while shown in settings, will never be displayed in any reports.

Is the router/firewall blocking outgoing NTP?
If the router blocks UDP port 123 (NTP), reports will show UTC time rather than the correct date/time and/or may never update, showing the same date repeatedly until port 123 is allowed out to the Internet. Please ensure that the router is not blocking outgoing NTP (UDP port 123), regardless of whether hardware or software agents are used.
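One way to check from a computer on the same network whether outgoing NTP is allowed is to send a minimal NTP query and wait for a reply. The Python sketch below is a troubleshooting aid under stated assumptions, not a FailPoints tool: the function names are made up, and pool.ntp.org is used only as a commonly available public server.

```python
import socket
import struct

NTP_TO_UNIX = 2208988800  # seconds between 1900-01-01 and 1970-01-01

def build_ntp_request():
    """48-byte client request: first byte packs LI=0, version=3, mode=3."""
    return b"\x1b" + 47 * b"\0"

def parse_transmit_time(response):
    """Return the server's transmit time as Unix seconds."""
    if len(response) < 48:
        raise ValueError("short NTP response")
    seconds = struct.unpack("!I", response[40:44])[0]
    return seconds - NTP_TO_UNIX

def outgoing_ntp_works(server="pool.ntp.org", timeout=3.0):
    """True if a reply comes back over UDP port 123, False otherwise."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(build_ntp_request(), (server, 123))
            response, _ = sock.recvfrom(512)
        parse_transmit_time(response)
        return True
    except (OSError, ValueError):
        return False
```

Calling outgoing_ntp_works() and getting False suggests that UDP port 123 is blocked somewhere on the path, although a timeout can also have other causes such as a DNS failure or a general outage.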

Note that data which has already been stored cannot be updated. Once the problem is fixed, new data will show correct date/time.

Excessive outages

A large number of outages could be displayed if there are problems because the agent is very sensitive and able to detect interruptions lasting only milliseconds. The default minimum outage length for FailPoints is set to one second or more to prevent extremely large amounts of data being sent.

Short outages are normal in IP based communications but ongoing excessive outages are indicative of problems which need to be addressed. Most of these problems are related to hardware failure, programming errors, wires failing, cut or disconnected momentarily and countless other things. 

Short outages typically do not affect most services and humans may not notice them until they start affecting Voice over IP phones, video conferencing and other real time services where quality and reliability could degrade.

While the user can ‘hide’ these from the agent’s reports, the outages will continue being received and accumulated into the reports. When such conditions occur, finding and fixing the problems is the best recourse.
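The difference between hiding and discarding can be shown with a small Python sketch. The names and sample durations are hypothetical; the point is only that a display filter leaves the stored outages untouched.

```python
def visible_outages(outage_seconds, minimum_outage=0):
    """Outages long enough to be displayed; the stored list is not modified."""
    return [s for s in outage_seconds if s >= minimum_outage]

stored = [1, 2, 45, 3, 600]            # outage lengths in seconds
shown = visible_outages(stored, minimum_outage=30)
hidden_count = len(stored) - len(shown)
```

Here three short outages are hidden from view but still counted in the stored data, which is why a steadily growing number of hidden outages deserves attention.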

Classification algorithm

The FailPoints algorithm is quite good at determining what is LAN, what is provider and what is beyond. However, there are some things to keep in mind in terms of networking and monitoring. The provider’s responsibility typically ends at the street and not necessarily in the customer’s premises. The LAN is the owner’s responsibility and the provider’s is from the router to the street and beyond.

Internet connectivity works by hops. In most situations, the local router, also known as the gateway, is hop 1. The next one or several hops will be the provider’s network. This is of course assuming a typical setup which does not have multiple internal routers before getting to the provider. The local IT people will have to know where the local network ends and where the provider’s starts. Both may be using private IPs.

The following private IP ranges could be in use on the local network and beyond into the provider’s network.
10.0.0.0 – 10.255.255.255
172.16.0.0 – 172.31.255.255
192.168.0.0 – 192.168.255.255
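To check whether a given hop address falls inside one of the ranges above, the standard Python ipaddress module can be used. The function name is an assumption for this sketch; the three networks are the ranges listed above written in CIDR notation.

```python
import ipaddress

# The three private ranges listed above, in CIDR notation.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),       # 10.0.0.0 - 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16.0.0 - 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),   # 192.168.0.0 - 192.168.255.255
]

def is_private_hop(ip):
    """True if the hop address lies in one of the private ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in PRIVATE_RANGES)

hops = ["192.168.1.1", "10.20.0.1", "8.8.8.8"]
private_flags = [is_private_hop(h) for h in hops]
```

A private address alone does not say whether a hop belongs to the LAN or to the provider, since both may use these ranges.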

When mousing over each outage event, details about the outage will be displayed. The first few hops on the LAN to the router/modem will usually be using one of the IP ranges above but the provider may also be using private IPs. In fact, in most cases, this is the situation. There could also be a mix of private and public IPs depending on the service and if there is a static IP.

The FailPoints algorithm will try to use all data available to determine the network topology in order to classify outages and which network they occurred on. However, in some cases, IT personnel will have to do this for themselves if the algorithm is unable to. Once local and provider hops are determined, the rest becomes clearer with the details provided by FailPoints.
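When classifying by hand, one simple approach is to look at the first hop that stopped responding and compare its position with the known number of local and provider hops. The Python sketch below illustrates that manual reasoning only; it is not the FailPoints algorithm, and lan_hops and isp_hops are values that IT personnel would have to supply for their own network.

```python
def classify_outage(hops, lan_hops=1, isp_hops=2):
    """Classify an outage by the position of the first non-responding hop.

    hops: list of (ip, responded) tuples in path order.
    lan_hops / isp_hops: how many hops belong to the local network and
    to the provider (assumed to be known by local IT personnel).
    Returns "LAN", "ISP", "Beyond", or None if every hop responded.
    """
    for index, (_ip, responded) in enumerate(hops, start=1):
        if not responded:
            if index <= lan_hops:
                return "LAN"
            if index <= lan_hops + isp_hops:
                return "ISP"
            return "Beyond"
    return None

trace = [("192.168.1.1", True), ("10.20.0.1", False), ("198.51.100.7", False)]
result = classify_outage(trace)
```

In the sample trace the gateway answers but the second hop does not, so the problem is attributed to the provider.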

If the issues are local they will need to be taken care of by someone at the location. If the issues are with the provider, generated connectivity data can help get the problem solved. If issues are beyond the provider, there is not much that can be done unless the problems are severe and costing the customer in some way so that a complaint can be sent to the upstream.

The biggest mystery about Internet problems is not knowing if one location only or others in the area were affected. If the organization has multiple locations being monitored in the area or if others nearby are motivated to use FailPoints, everyone monitoring can gain a better understanding by correlating reports. The FailPoints philosophy is that one location experiencing problems can easily be dismissed if there is no evidence but many experiencing the same things cannot.

Note that in Internet (TCP/IP) communications there must be a source and a destination. Agents maintain tests between their locations and one or more destinations across the Internet. As a reminder, reports showing problems beyond providers are mainly informational since there is little control beyond the provider unless a complaint is filed directly with that company.

Privacy is important

The monitoring software does not and cannot collect personal information such as Internet locations visited, connected to, search results and so on. The device is connected as a LAN client without access to packets from other devices. FailPoints agents monitor only Internet connectivity and not the packets/data flowing.

Please see our Privacy Statement for more details.