Tag Archives: Spreadsheet Dependence Tracing

Presenting Aragorn: A Floating Prompt System for Spreadsheets and the AraSense Dependence Tracking Feature

aragorn

So what (who) is Aragorn?

We all know who Aragorn is! The mighty brave king of Men, Isildur’s heir, the skilled warrior, protector of the Fellowship of the Ring and the hobbits. Unfortunately we are not talking about him here. We are in the process of developing an Excel add-in tool, which quite similar to the real Aragorn, will help you protect yourself from orcs, goblins, and such nasty creatures (mistakes, errors, pitfalls) while you work on your spreadsheets.

What made us develop Aragorn?

We were trying to have a closer look at spreadsheet dependence tracing techniques. In my blog post I discussed some of our findings from a review of the research on this topic. We found that there was still enough need for good and reliable dependence tracing techniques which can be adequately and effectively adopted by the industry in a standardized and popular way. In 2013, in her dissertationDr. Felienne Hermans wrote about some responses from spreadsheet users –

Quote

We were particularly motivated by a scenario where you, as a spreadsheet user, have the task of coming up with some calculation or financial estimate after some data crunching in a fairly large complex spreadsheet which you have not created yourself. Imagine, to obtain the required results you need to change some parameter values, modify some cells, move data around here and there. However not knowing well the full design structure of the spreadsheet you definitely do not want your changes to affect the whole model so that the results you obtain turn out to be wrong! On the other hand perhaps it´s not even worth it for you to spend days in trying to understand the spreadsheet model as it may be for a one-off use in your case. This is exactly the type of scenario where a quick, real-time prompting system displaying cell dependency information can make things a lot easier. While this was our primary motivation, we also wanted to make sure that the tool was easy and lucrative enough for you. Therefore we also aimed to ensure that the following key features were there in our tool:

  1. Effortless: We did not want to burden you with additional effort of multiple mouse clicks or peering into the screen trying to distinguish between overlapping blue arrows which all seem to point towards the same cell or no-where.
  2. Real-time: We wanted our system to be a process that is part of using the spreadsheet itself and not a separate activity to be done at a different time.
  3. Instantaneous: We wanted our system to be quick and momentary so that you do not have to wait for the information.
  4. Specific: We did not want to burden you with more information than you really need; it’s not always necessary or worthwhile to understand the whole structure of a spreadsheet when all you need to know is whether a change in one or two cells are going to affect other cells.

So what have we come up with?

The core of Aragorn is the floating prompt system. With these localized text-boxes very close to active cells, we can display crucial information about the selected cell in a transient manner.

5

In addition to this we have developed the feature called AraSense particularly to cater the need of dependence tracking especially for scenarios like I described above. It is our answer to the questions raised by the users.

We all know how the real Aragorn is also an expert ranger with superior tracking skills. He could track the trail of hobbits when they almost gave up hope of finding them alive!

 

download
Aragorn tracking hobbits’ trail.

Similarly the AraSense feature is the tracking power of the Aragorn tool. It can pre-sense dependencies and thus its called AraSENSE. It can do a one-time analysis of an opened Excel workbook, after which it will display a list of dependent cells in the floating prompts for every cell you click on. If you are not trying to edit the cell, normally the prompt will disappear automatically after a delay. If you are trying to edit a cell, the prompt will persist next to it as long as you are editing the cell. Even if the dependent cells are in a different worksheet, they will be listed in the prompt. You can turn the feature on or off based on your preference. No warning is shown for cells that do not have any dependents. Thus when the feature is turned on, if there is no prompt for a cell, then you can safely modify its contents without worrying. And if there is a prompt then you know which are the dependent cells and on which sheet they are. Thus you can be cautious and forewarned about the changes you make. The AraSense feature is effortless, real-time, instantaneous and specific. It fulfills all the aims we had about the system we wanted to give you.

So how it works?

 

So what we would like from you?

We would like to hear your thoughts on Aragorn and AraSense.

If you are a professional who uses spreadsheet at work, then do you think this tool will be helpful for you?

If you are not a spreadsheet user, but interested about end-user computing and works with usability aspects then do you think our tool addresses some key usability aspects?

Whoever you are, if you have read this and found it worthwhile of your time, we would like to thank you and would really like your thoughts and opinions. Encouragements are obviously appreciated but we would also eagerly welcome criticisms and suggestions for improvements.

New paper: Reviewing Research on Spreadsheet Dependence Tracing Techniques

 

Hi all. In my blog post Dependence Tracing: Need for an Investigation I talked about what is dependence tracing in spreadsheets and what is the necessity to review existing techniques to do the same. We did a study of the research that has been done on the topic and I am happy to inform you that the paper documenting the study and its conclusions is going to be published in the proceedings of SEMS’14. In this post I briefly highlight some key aspects of our findings. I recommend my previous post  Dependence Tracing: Need for an Investigation for those interested to have a little background information on the problem of dependence tracing.

We were interested particularly to understand the state of the art of research on dependence tracing. Our principal motivations arose from feedbacks we had obtained in previous user studies and surveys where there was a clear dissatisfaction among the industrial spreadsheet users about available dependence tracing tools and enhancements. To put very frankly it is an unanimously accepted fact that MS Excel is by far the most popular spreadsheet application used in the industry today. The feature that is available for dependence tracing in-built with MS Excel is the blue tracing arrows that points to precedent or dependent cells. This technique evidently is not the most convenient one for all types of scenarios that arise for an end user when she is creating or modifying spreadsheets. It has some known issues like:

  • wastage of effort and time (one mouse click too many),
  • ambiguity (arrows pointing to cells of the same row end up superimposing on each other creating confusion)
  • interference (it does indicate the dependents but it also hampers the user’s visual clarity by creating cluttered arrows all across the worksheet) that makes it inconvenient to both work on the sheet and get idea about dependence structure simultaneously.

TracingDependents

 

Therefore it was our hypothesis under the paper’s context that better dependence techniques or tools developed under scientific research was not adequately making it across to the industry to benefit real-life spreadsheet users. Based on this hypothesis we identified three research questions:

  1. Why the research ideas are not getting across to the industry as popular and user-friendly implementations?
  2. Is there any standardized metrics framework with which such implementations as above can be compared against each other in their effectiveness to real-life spreadsheet users?
  3. Is there any clear opportunity of improvement in the dependence tracing context?

In order to find answers for these questions, our first step was a review of the existing research literature on this topic and this preliminary review is practically what makes up this paper.

We shortlisted five published research papers for our study, each having description and details of some innovative idea for dependence tracing. Our approach was to characterize these innovations based on the following aspects:

  • The basic technique/principle/strategy
  • Tools or prototypes developed if any
  • Comments or details available on testing, performance, and limitations of the tool
  • Current status of the research and its implementation, and its perceived relevance or influence in the industrial scene
Paper Published In Technique Prototype Prototype Comments Industrial Relevance
  1. 1998
Fluid Visualization & Animation Pad++ (now defunct) and Python for UNIX Performance drastically worsens after 400 cells.No user study. No implementation plan given. Some ideological influence can be observed in MS Excel 2007 onwards.
  1. 1999
3D Interactive lift-up operation OpenGL APIs and SLSC for UNIX None available.  None.
    3.  2000 Visual indications superimposed on Excel sheets like arrows, highlights, shading. VBA plug-in for Excel. None available. No standardized definite influence. However similar techniques are used in ad hoc basis by expert spreadsheet users on individually or internally to organizations.
  1. 2003
Different types of graphs and trees depicting spreadsheet dependency structure. Java tool for Excel workbooks. Successfully tested on a corpus of 259 Excel workbooks of undisclosed source.No user acceptance or comprehension study. Authors apparently did not carry forward their work on this project.
  1. 2011
Leveled dataflow diagrams C# toolkit that accepts Excel workbooks for input. Extensive user testing with 27 industrial spreadsheet users in a financial asset management company. Is currently a part of service offered by the company Infotron.nl

 

Thus, from this review we observe that:

  1. All 5 papers provide innovative ideas to simplify dependence tracing with visualization aids.

=>  There is no lack of ideas in this context.

  1. All 5 papers provide either tools or their prototypes to demonstrate the techniques.

=> The ideas are quite conceivable.

  1. 2 out of 5 prototypes are based on defunct or obsolete technology. They have not been propagated to newer current technologies. One that is based on VBA (made in 2000) is supposed to work for all Excel versions but is actually subject to be tested for version incompatibility. Only 2 out of 5 are made with Java or C# and can be considered as modern enough to actually be working at present.

=> There is lack of recent work done based on current state of technology that can be used at present in the industry.

  1. 3 out of 5 were properly tested. One showed very poor performance as 400 cells is nothing compared to the number of cells that is found in spreadsheets used in the industry! No solution or approach was provided about how to scale this up.

=> All ideas that are conceived need to be tested for performance. If that is not done, then it is evident why it will not find itself across to the industry and actually help real-life spreadsheet users.

  1. Only 1 out of 5 was tested for useability and user-acceptance!

=> Innovative ideas are all exciting, but they are meaningless if actual users don’t find them helpful. Without user studies the effectiveness of ideas or research can not be established.

  1. Only 1 out of 5 actually ended up having industrial exposure. All the rest were practically aborted or left off by their creators.

=> There must be serious issues in making these tools actually popular to the real life spreadsheet users; or else the researchers working in the domain are not properly motivated to do anything about the problems.

 

We can now revisit our research questions taking into account the above observations:

  1. Why the research ideas are not getting across to the industry as popular and user-friendly implementations?

Although still not quite clear as to what is the reason behind this, one reason seems to be just plain lack of motivation on part of researchers. Otherwise why not follow up on your innovation and see to it that it gets implemented. A reason behind this reason can also be the pressure in the academia which encourages writing tons of papers on new ideas but does not give equal limelight to follow-ups and industrial implementations!? This may cause researchers to abandon their projects at the conception stage and move on to newer projects.

Another reason seems to be difficulty in performance scaling. Spreadsheets are tricky objects with sizes varying from small to very large. The range of sizes is quite phenomenal. This might be a cause due to which innovative ideas are found difficult to be put into real practical use? This reason is indicated by the fact that most of the projects we covered in the review were not extensively tested.

  1. Is there any standardized metrics framework with which such implementations as above can be compared against each other in their effectiveness to real-life spreadsheet users?

Evidently there is none. And no wonder why only one of the projects was tested extensively with a group of real-life spreadsheet users. This is a very important necessity. To come up with a set of standards and metrics that can define useability or user-friendliness of spreadsheet aid applications. In that way it will be easy to establish the true worth of innovative techniques as laid against backdrop of effectiveness to real-life users.

  1. Is there any clear opportunity of improvement in the dependence tracing context?

Based on our review and the observations obtained, it is doubtless that this question is still a very much wide open one. There can be enough opportunity found to research and come up with new tools that actually make a difference in the spreadsheet industry in form of reliable and user-friendly implementations. And not only ones that are created in ad hoc basis for internal use in organizations but robust standardized ones that can be widely adopted through out the industry.