
Question about AddonFiles repository

José Manuel Delicado Alcolea
 

Dear NVDA community,

Since joining this mailing list, I have followed the development and review process of many add-ons. I have learnt a lot from these processes, but there is one thing that I don't understand at all: why some users have write permissions on the AddonFiles repository and others don't. For example, see the most recent messages in this thread: https://nvda-addons.groups.io/g/nvda-addons/topic/online_ocr_addon/30413942?p=Created,,,20,1,100,0::recentpostdate%2Fsticky,,,20,2,0,30413942#

In case you don't know, AddonFiles is the Bitbucket repository where download links (and, when needed, some add-on files) are stored. Most add-on authors are skilled with Git. Still, I understand that they may prefer a reviewer to update their links, because an error in the get.php file completely breaks an important part of the website.

However, if authors want to maintain their add-ons themselves, they can request access to the AddonFiles repository, and most of the time it is granted. This is not my case.

I have requested write access at least twice, not only to AddonFiles but also to the Bitbucket copies of my two add-ons, Enhanced Aria and Input Lock. I have a Bitbucket account and experience with Git, and I don't want to depend on a reviewer each time I want to release an update, especially a minor one.

So, here is my question: which professional and objective criteria do the reviewers follow when they grant write privileges to add-on authors on Bitbucket? If the answer is that each reviewer decides what to do once the add-on has passed the basic review, I find this situation quite unfair. All add-on authors should have the same resources and play by the same rules!

I will request these permissions one more time, and I hope that this time I get them. My Bitbucket username is jmdaweb.

Best regards.

--

José Manuel Delicado Alcolea
Administrator and editor of the nvda.es website
Twitter: @nvda_es
Certified in the NVDA Expert 2019 program



Re: Online OCR addon #addonrequestreview

Noelia Ruiz
 

Hi, yes, you can update the .mdwn directly, respecting tags like the meta title, the list at the beginning with your name, compatibility info and, for now, the development link, and the dev tag at the bottom.
About the issue, OK, I will download the add-on when it's fixed. Also, maybe you can use the core.callLater function; just an idea.
Also, for some reason, the message used when recognizing with OCR seems to be stored in NVDA's core, since it is translated into Spanish; I see that this happens only for OCR, not for image describers.

Cheers

On 28/03/2019 at 14:21, Larry Wang wrote:
Hi, Noelia,
I have updated the link in get.php. I have an svn translation account too.
How can I update the documentation? May I update the MDWN file from the translation svn directly?
This issue seems to be caused by calling ui.message from networkThread.
I will try to refactor the engine code and move ui.message elsewhere;
maybe I can use queueFunction instead of calling it directly.
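[Editor's sketch] The idea of queuing the call instead of invoking it from the network thread can be illustrated outside NVDA with plain Python. This is only a minimal sketch under stated assumptions: `main_queue`, `message`, `queue_function`, `network_callback` and `pump` are hypothetical stand-ins, not the add-on's code. Inside NVDA, the analogous call would presumably be `queueHandler.queueFunction(queueHandler.eventQueue, ui.message, text)` or `wx.CallAfter(ui.message, text)`.

```python
import queue
import threading

# Hypothetical stand-in for the main-thread event queue.
main_queue = queue.Queue()
spoken = []


def message(text):
    """Stand-in for ui.message: must only ever run on the main thread."""
    assert threading.current_thread() is threading.main_thread()
    spoken.append(text)


def queue_function(func, *args):
    """Schedule func(*args) to be executed later by the main thread."""
    main_queue.put((func, args))


def network_callback(result):
    # Runs on the worker thread: queue the call rather than calling message().
    queue_function(message, result)


def pump():
    """Execute all pending calls; intended to be called from the main thread."""
    while True:
        try:
            func, args = main_queue.get_nowait()
        except queue.Empty:
            break
        func(*args)


worker = threading.Thread(target=network_callback, args=("recognition finished",))
worker.start()
worker.join()
pump()  # the main thread performs the queued message() call
print(spoken)
```

The worker thread never touches the speech or braille code itself; it only hands the work over, which is what avoids the "timer can only be started from the main thread" assertion shown in the traceback.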
On 2019/3/28 20:55, Noelia Ruiz wrote:
Also, the swap setting does not seem to work for image describers, just for OCR.
This is a bug when recognizing the Google Logo:

ERROR - stderr (13:52:14.867):
Exception in thread Thread-8:
Traceback (most recent call last):
   File "threading.pyo", line 801, in __bootstrap_inner
   File "threading.pyo", line 754, in run
   File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\winHttp.py", line 115, in postContent
   File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\winHttp.py", line 109, in doHTTPRequest
   File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\onlineOCRHandler.py", line 514, in callback
   File "ui.pyo", line 67, in message
   File "braille.pyo", line 1795, in message
   File "braille.pyo", line 1807, in _resetMessageTimer
   File "wx\core.pyo", line 3305, in Start
wxAssertionError: C++ assertion "wxThread::IsMain()" failed at ..\..\src\common\timerimpl.cpp(60) in wxTimerImpl::Start(): timer can only be started from the main thread


2019-03-28 13:35 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:
Hi Larry, I have given you write access to the addonFiles repo. Please
clone it and look at the get.php file and the entry for your add-on,
identified as oid.
There you can replace the redirected link with the updated URL.
After work I will update the documentation on the website if you can't.
Clone it with:
git clone https://bitbucket.org/nvdaaddonteam/addonfiles.git
Cheers


2019-03-28 11:20 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia,

If Robert's issue is fixed, I think it is better to post 0.13.

I have a Bitbucket account and I am willing to post versions myself.

Actually this addon was on a private repo on bitbucket before I put it
on github.

This is my profile page.

https://bitbucket.org/Rheinmetal/

On 2019/3/28 17:37, Noelia Ruiz wrote:
Hi, let us know if we should post 0.12 or 0.13 with the profile issue fixed.
Also, for add-ons reviewed by me, if authors show Git experience, I
try to offer write access to the addonFiles repo if they have a Bitbucket
account, so that authors can update versions themselves on the website if
they want. Otherwise, we can do it.
So, if you have a Bitbucket account and want write access to
addonFiles, let us know. Just be patient, since Bitbucket is not very
convenient and granting this access can be hard; at least in my
case, I sometimes need to retry several times.

Cheers


2019-03-28 9:51 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia

1. Documentation problem mentioned above is corrected in 0.12

2. Access type is valid for Oliver's engine. That is because I provided
a proxy for users without access to Google; by default it connects
directly to Google.
3. I will look into how to use a checkbox. The checkbox in advanced
panel may be a good candidate.


On 2019/3/28 15:10, Noelia Ruiz wrote:
Also, the plugins based on Oliver's work do not seem to need an API key.
So I think that the combo box should be disabled, since the combo
box label refers to an API key but the options correspond to the proxy.
About the details supported by plugins like the Azure Analyser, I think
that a list of checkboxes may be better than a combo box.
Also, I saw that in some places imageInfo appears as not defined. This
is normal, since the add-on is very complex and has a lot of plugins.
Anyway, we can update the version and formatted documentation easily.

Cheers


On 27/03/2019 at 23:05, Noelia Ruiz via Groups.Io wrote:
Hi, your suggestion sounds good. I don't use the numeric keypad and I use
the laptop layout with Caps Lock as the NVDA key, so I don't have problems
with combinations, but I agree with you.

Cheers


On 27/03/2019 at 22:16, Robert Hänggi wrote:
Hi Noelia,
I would swap the two gestures. NVDA+Alt for the normal functionality
and NVDA+Shift+Control for the clipboard because pressing three keys
is hard enough, no need for 4. Especially if you want to perform a
double gesture.

I wonder, shouldn't the NumPad be used, if available? After all, the
gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
(I've seen the latter one in the dragAndDrop addOn but I don't know
if that is still maintained).

Robert
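[Editor's sketch] Robert's proposal amounts to one rule: gestures with Alt act on the navigator object, gestures with Control+Shift act on the clipboard. The snippet below is only an illustration of that rule; the script names are hypothetical and are not taken from the add-on, and only the `kb:` gesture identifier style follows NVDA conventions.

```python
# Hypothetical script names; the key combinations follow the thread.
GESTURES = {
    "kb:NVDA+alt+r": "recognizeNavigatorObject",
    "kb:NVDA+alt+p": "describeNavigatorObject",
    "kb:NVDA+control+shift+r": "recognizeClipboardImage",
    "kb:NVDA+control+shift+p": "describeClipboardImage",
}


def is_consistent(gestures):
    """Return True if Alt gestures, and only those, target the navigator object."""
    for identifier, script in gestures.items():
        keys = identifier.split(":", 1)[1].split("+")
        uses_alt = "alt" in keys
        targets_navigator = script.endswith("NavigatorObject")
        if uses_alt != targets_navigator:
            return False
    return True


print(is_consistent(GESTURES))
```

A check like this makes it easy to spot the kind of mismatch discussed in this thread, where the same modifier set means "navigator object" for one feature and "clipboard" for another.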

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+Control+Shift is used with R for the navigator object and with P
for clipboard images, while Alt+NVDA is used for the navigator object
in the case of the image describer and, with R, for the clipboard. This
may be inconsistent. Maybe it would be better to use Control+NVDA+Shift
for the navigator object and NVDA+Alt for the clipboard?

Cheers


On 27/03/2019 at 20:39, Noelia Ruiz via Groups.Io wrote:
Hi again, the add-on is on the website. Anyway, I recommend formatting
the documentation a bit. For example, you may want to change the first
level 3 heading after the level 1 heading to level 2, and also fix the
lists, since the list ends after the log option, etc.
I have put asterisks at the start of lines, so that compatibility info,
author and download links appear in a list, as done with other add-ons
on the website.
Also, the link using the addonFiles address is used, to avoid issues in
translations.
When you declare the add-on as stable, we can register it to be
translated. For now, translators can start with the documentation.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

On 27/03/2019 at 19:29, Noelia Ruiz via Groups.Io wrote:
Hi, basic review results after adding image describers:

- License and copyright: Pass.
- Security: pass.
- Documentation: Pass.
- User experience: pass with comments.

Image describers can provide text results, but not in browse mode, and
this has to be fixed. Also, when pressing the gesture twice, NVDA can
report that another recognizer is active.
Anyway, seeing that the author is listening and responding to feedback,
that Oliver's work has been used and mentioned in the documentation,
that the code shows deep experience with the relevant NVDA code, that
the main structure is done and the failures affect only part of some
features, and that this should be tested in different languages, I think
we can post it in the development section of the website to share it
widely in different communities, since the add-on is useful now and
will be improved.
Since no objections have been expressed, I will post the add-on now.
Here is my log for the image describer errors:

INFO - external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard (19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201}, u'description': {u'captions': [{u'text': u'a drawing of a face', u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']}, u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201}, u'description': {u'captions': [{u'text': u'a drawing of a face', u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']}, u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201}, u'description': {u'captions': [{u'text': u'a drawing of a face', u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']}, u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}
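[Editor's sketch] The add-on's own handling of this response is not shown in the thread, so the following is only an illustration of how plain text could be built from a dictionary shaped like the logged one. The `describe` function is hypothetical, and the sample data is copied from the log above.

```python
# Sample response with the same shape as the logged one.
response = {
    "metadata": {"width": 272, "format": "Png", "height": 201},
    "description": {
        "captions": [
            {"text": "a drawing of a face", "confidence": 0.3644558611090458}
        ],
        "tags": ["drawing", "food"],
    },
}


def describe(resp):
    """Build a plain-text summary: one line per caption, then the tags."""
    description = resp.get("description", {})
    lines = []
    for caption in description.get("captions", []):
        percent = caption["confidence"] * 100
        lines.append("%s (confidence %.0f%%)" % (caption["text"], percent))
    tags = description.get("tags", [])
    if tags:
        lines.append("Tags: " + ", ".join(tags))
    return "\n".join(lines)


print(describe(response))
```

A single string like this can be spoken directly or shown in a browsable document, without anything having to iterate over a result object.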



On 27/03/2019 at 18:36, Noelia Ruiz via Groups.Io wrote:
Yes, I have seen an error with the text result in browse mode with the
image describer plugin. Anyway, I think it has received a lot of
feedback and could simply be added to the development section of the
website, since the major part is done, it has passed basic review, the
quality in general is good, and no major changes in documentation are
going to happen for now. If you don't have objections, I will post this
in about an hour for wide testing, and if minor versions are released
they will be posted too. Meanwhile, translators can start to translate
the documentation so that different communities can test the add-on
with some quality and feedback, as a first filter.
Cheers

Sent from my iPhone

On 27 Mar 2019, at 17:43, Robert Hänggi <aarjay.robert@gmail.com> wrote:

Hi
I'm glad that the addOn makes such huge steps.

The browsable message doesn't seem to work for NVDA+Alt+P if I swap
the gestures.

The description might need some formatting, at least some line breaks.

That's the result with azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains
racy
content
Dominant foreground color is Red. Dominant background color is
White.
Hex code of accent color is 000000. Dominant colors: White The
image
is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of
Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or expressed
with the NVDA colour descriptions.
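[Editor's sketch] These three suggestions (line breaks, omitting empty tags, showing the accent colour as RGB) can be illustrated in a few lines. This is a sketch with hypothetical helper names, not the add-on's code; the values come from the Azure result quoted above, and the empty "Brands" section is invented purely to show the omission.

```python
def hex_to_rgb(hex_code):
    """Convert a six-digit hex colour such as '000000' to an (r, g, b) tuple."""
    value = hex_code.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def format_result(sections):
    """Put each non-empty section on its own line; skip empty ones."""
    lines = []
    for label, values in sections:
        if values:
            lines.append("%s: %s" % (label, ", ".join(values)))
    return "\n".join(lines)


sections = [
    ("Categories", ["others_", "outdoor_", "text_sign"]),
    ("Tags", ["screenshot", "design", "pixel", "vector", "typography"]),
    ("Brands", []),  # hypothetical empty section, left out of the output
    ("Accent colour (RGB)", ["%d, %d, %d" % hex_to_rgb("000000")]),
]
print(format_result(sections))
```

Expressing the colour with NVDA's own colour descriptions, as also suggested, would need NVDA's modules and is not shown here.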

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this wonderful
add-on on the website after work, in the evening, in about 2 hours or
so.
Personally, I am deeply moved by this. Of course, this doesn't
replace a human description, where we can ask questions and so on.
I think this also opens a door for students in scientific degrees,
where people can have a lot of problems and some of us have heard
things like: "Why don't you try to drive a car?". Some things can be
identified by machine learning, or at least this can help us, and that
is very important.
I have a lot of work and hope other reviewers can help with new
add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang
<larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of online ocr addon is released.
Changes in this version include:
Change addon summary to online image describer
Added image description capability

NVDA+Alt+P: Recognizes the current navigator object, then reads the result. If pressed twice, opens a virtual result document.

Control+Shift+NVDA+P: Recognizes the image in the clipboard, then reads the result. If pressed twice, opens a virtual result document.
Three engines are available.

### Machine Learning Engine by Oliver Edholm
It's a free engine that gives a description of an image.
If there is text inside, it will do OCR on the image.
There are two settings for this engine.
* Language of result:
English by default. If you configure a language other than English, the description could have translation issues because it is automatically generated by a machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud Platform for analysis. After the analysis the image gets removed from the server and will never be seen again.

The author of this add-on has set up a proxy on www.nvdacn.com for users who cannot access Google services.
If you want to use this proxy, please choose "Use proxy on www.nvdacn.com" in the access type settings.

If you want to use your own key in the following two Microsoft engines, please follow the guide in the Microsoft Azure OCR section.
### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on the image content.
This engine is English only for now.

Visual features include:
Adult - detects if the image is pornographic in nature (depicts nudity or a sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English.
Categories - categorizes image content according to a taxonomy defined in documentation.
Color - determines the accent color, dominant color, and whether an image is black&white.
Description - describes the image content with a complete sentence in supported languages.
Faces - detects if faces are present. If present, generate coordinates, gender and age.
ImageType - detects if image is clip art or a line drawing.
Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English.
Tags - tags the image with a detailed list of words related to the image content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human readable language with complete sentences. The description is based on a collection of content tags, which are also returned by the operation. More than one description can be generated for each image. Descriptions are ordered by their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description of the image. English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The default is 1.


Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon






Cheers,
Larry



On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz
<nrm1977@gmail.com>
wrote:

In case it's useful, more suggestions:
- The documentation presented in input help mode for
NVDA+Shift+Control+R announces that it recognizes the content and
opens a virtual result, but this is configurable. This could be
mentioned, indicating that the result could be spoken and brailled or
presented in a virtual document.
- Regarding the code, I have seen that sometimes something like this
is used:
if something: return
else...
But I think that in some cases the else after return is not needed,
since if the previous condition is met, the script returns and nothing
more can happen. I made this mistake years ago, and Mesar Hameed
corrected me, in case you are interested. Mesar is a great person and
developer who initially created the scripts used for the translation
system, which make add-ons translatable by NVDA's translators by
exchanging messages between the add-ons stored on the Bitbucket team
account and the translators' repository, and who also first created
the guidelines for reviewers and authors, and so on.
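[Editor's sketch] The point about else after return can be shown with two equivalent functions. The function names and return values are hypothetical and only illustrate the control flow, not the add-on's code.

```python
def status_with_else(image):
    """Version with a redundant else after return."""
    if image is None:
        return "no image"
    else:
        return "recognizing"


def status_early_return(image):
    """Same behaviour: once the first branch returns, no else is needed."""
    if image is None:
        return "no image"
    return "recognizing"


for sample in (None, "ant.jpg"):
    print(status_with_else(sample), status_early_return(sample))
```

Both versions behave identically; the second simply keeps the main path unindented, which tends to help as more checks are added.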
Whenever you want, for example when image descriptions are added, we
can post the add-on on the website, and if this is joined with
ImageDescriber, we will fix things later. I say this since it may be
complicated to join these add-ons, imo, at least currently, since the
two interfaces are very different. Anyway, if you want to wait to post
the add-on, this is OK.


Cheers

On 22/03/2019 at 11:49, Noelia Ruiz via Groups.Io wrote:
Sorry, another thought: I think that Larry could accept pull requests
if the features contained in ImageDescriber are properly integrated in
online_ocr. For example, I don't know whether it is a good idea just to
join the two add-ons, and I would understand if for any reason this
could not be done and they remain as independent add-ons. Please don't
feel any pressure on our part (meaning Robert and me).
Just to clarify.

2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:
Hi, this is great news. I think that Larry may accept pull requests,
since the add-on GUI seems to be flexible, accepting profiles,
integrated in NVDA settings, and accepting content recognition
messages, etc. Also, I think that creating pull requests for this
add-on can be easy, using the abstract class for plugins. So, imo, if
you both agree, I would vote for adding just one joined add-on on the
website.
ImageDescriber could be a better name since it's more generic, though
I definitely would use Larry's UI and framework. What do you think?

Cheers and thanks both for your work.

2019-03-22 11:25 GMT+01:00, Robert Hänggi
<aarjay.robert@gmail.com>:
Oliver, I'm very glad that you're open to collaboration.
I think your joint efforts would lead to a great add-on.
Apart from that, it may help to avoid clones of 3rd party modules
present in both add-ons, thus reducing storage and load time.

I could perhaps also help with mouse related stuff (e.g. recognition
of unknown pointer shapes automatically).

Cheers
Robert

On 22/03/2019, Oliver Edholm <oliver.edholm@gmail.com>
wrote:
Hi! It's the creator of the Image Describer add-on. I looked a little
bit at the code and it looks very well made.

I agree that we do similar things here. Yesterday I added OCR to the
Image Describer backend, for example (without knowing about this
add-on).

I'm also working on a free Captcha Solver, which seems to be something
Larry has also looked into.

Applying more AI methods to these problems is also something I've been
working on. You gave the example of icon classification; I trained a
classifier for this ~2 weeks ago, which I intend to eventually add to
the Image Describer add-on.

I'm definitely open to some sort of collaboration here.

On Fri, Mar 22, 2019 at 01:39 AM, Robert Hänggi wrote:


Wouldn't AI conflict with the objectDescriber add-on?
I mean, two add-ons with essentially the same functionality is a bit
strange.

However, I certainly see positive results when AI is integrated.
Imagine an inaccessible application. A lot of the time, it uses not
only text but also symbols, and it would be great if general ones could
be recognized:
- disc icons
- cog wheel/wrench icons
- transport controls: play/fast forward/rewind icons
- home icon
- triangles (for drop down menus)
- check boxes, radio buttons, folder icons (= browse for folder) and so on.

Additionally, simple rectangles, which could indicate form fields and
edit boxes. Compare the auto form detection used in Acrobat Pro DC.

(By the way, in my version of Golden Cursor I make use of changing
mouse icons to determine e.g. the boundaries of an edit box, scroll
bars or sliders.)

Another thing that I would like to see:
If one presses Enter in a recognition result, the default action is
performed. So far so good, but here are some possible improvements:
Case 1, when the recognition result stays open:
- perform a new recognition and update the document to reflect
possible changes.
Case 2, when the navigator object changes and the document closes:
- if the new navigator object is not a known class (e.g. window or
system client or generic, without children), do as for case 1.

Best
Robert



On 21/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:

Control+NVDA+Shift+I is assigned to the readfeeds add-on, but feel free
to assign it, since readfeeds is an add-on maintained and initially
created by me (smile).

Sent from my iPhone


On 21 Mar 2019, at 8:17, Larry Wang <larry.wang.801@gmail.com> wrote:

AI recognition is not hard to add since it performs a similar operation
on an image; I have found several engines too. But we need another two
gestures.

Are NVDA+Ctrl+Shift+I and NVDA+Ctrl+Alt+Shift+I unassigned?


On 2019/3/21 12:53, Noelia Ruiz wrote:
OK, let's wait some hours, and tonight or tomorrow I will try to post
the add-on on the website if all is OK. For now I will upload the dev
version with one link, and when you declare the add-on as stable we
will post the link to the stable version too and will register the
add-on in the translation system.
I would like it if you could add AI recognition of images, for instance
whether they contain a man, a room, blond or brown hair, etc.
This is wonderful work!

On 21/03/2019 at 1:47, Larry Wang wrote:
Hi everyone,

I am happy to release version 0.10 of online ocr
addon.
Changes in this version include

Fix error using user's own api key in sougou API

Fix unknown panel in sougou API settings

Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.10-dev/onlineOCR-0.10-dev.nvda-addon





Cheers,
Larry








Re: app modules not working after updating to 2019.1

Pettyjohn, Chris G. (FTC)
 

Thanks Derek,  that is what I'm doing. 


Re: Online OCR addon #addonrequestreview

Larry Wang
 

Hi, Noelia,

I have updated the link in get.php. I have an svn translation account too.

How can I update the documentation? May I update the MDWN file from the translation svn directly?

This issue seems to be caused by calling ui.message from networkThread.

I will try to refactor the engine code and move ui.message elsewhere.

Maybe I can use queueFunction instead of calling it directly.

On 2019/3/28 20:55, Noelia Ruiz wrote:
Also, the swap setting seems not work for image describers, just for OCR.
This is a bug when recognizing the Google Logo:

ERROR - stderr (13:52:14.867):
Exception in thread Thread-8:
Traceback (most recent call last):
File "threading.pyo", line 801, in __bootstrap_inner
File "threading.pyo", line 754, in run
File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\winHttp.py",
line 115, in postContent
File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\winHttp.py",
line 109, in doHTTPRequest
File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\onlineOCRHandler.py",
line 514, in callback
File "ui.pyo", line 67, in message
File "braille.pyo", line 1795, in message
File "braille.pyo", line 1807, in _resetMessageTimer
File "wx\core.pyo", line 3305, in Start
wxAssertionError: C++ assertion "wxThread::IsMain()" failed at
..\..\src\common\timerimpl.cpp(60) in wxTimerImpl::Start(): timer can
only be started from the main thread


2019-03-28 13:35 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:
Hi Larry, I have given you write access to addonFiles repo. Please
clone it, see get.php files and the entry for your add-on, identified
as oid
There you can replace the redirected link to the updated URL.
After job I will update documentation on the website if you can't.
Clone at
git clone https://bitbucket.org/nvdaaddonteam/addonfiles.git
Cheers


2019-03-28 11:20 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia,

If Robert's issue is fixed, I think it is better to post 0.13.

I have a bitbucket account and I am willing to post versions by my self.

Actually this addon was on a private repo on bitbucket before I put it
on github.

This is my profile page.

https://bitbucket.org/Rheinmetal/

On 2019/3/28 17:37, Noelia Ruiz wrote:
Hi, let us know if we should post 0.12 or 0.13 with fixed profile issue.
Also, for add-ons reviewed by me, if authors show Git experience, I
try to offer write access to addonFiles repo if they have a Bitbucket
account, so authors can update versions themselves on the website, if
they want. Otherwise, we can do it.
So, if you have a Bitbucket account and want write access to
addonFiles, let us know. Just be patient, since Bitbucket is not very
comfortable and provide this access can be hard and, at least in my
case, sometimes I need to retry it several times.

Cheers


2019-03-28 9:51 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia

1. Documentation problem mentioned above is corrected in 0.12

2. Access type is valid for Oliver's engine. That is because I provided
a proxy for users without access to google, by default it connects
directly to google .
3. I will look into how to use a checkbox. The checkbox in advanced
panel may be a good candidate.


On 2019/3/28 15:10, Noelia Ruiz wrote:
Also, the plugins based on Oliver's work seems not to need an api key.
Then I think that the combo box should be disabled, since the combo
box label refers to api key, but options correspond to proxy.
About the details supported by plugins like Azure Analizer, I think
that a list of checkbox may be better than a combo box.
Also, I saw that in some part imageInfo appears as not defined. This
is normal, since the add-on is very complex and have a lot of plugins.
Anyway, we can update the version and formatted documentation easily.

Cheers


El 27/03/2019 a las 23:05, Noelia Ruiz via Groups.Io escribió:
Hi, your suggestion sounds good. I don't use numeric keys and use
laptop key with bloc uppercase as NVDA key, so I don't have problems
with combinations, but I agree with you.

Cheers


El 27/03/2019 a las 22:16, Robert Hänggi escribió:
Hi Noelia,
I would swap the two gestures. NVDA+Alt for the normal functionality
and NVDA+Shift+Control for the clipboard because pressing three keys
is hard enough, no need for 4. Especially if you want to perform a
double gesture.

I wonder, shouldn't the NumPad be used, if available, after all, the
gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
(I've seen the latter one in the dragAndDrop addOn but I don't know
if
that is still maintained).

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+control+shift are used, when pressing r, to object navigator,
and
when pressing p, for clipboard images. And alt+nvda, for navigator
object in case of image descriptor, and with r for clipboard. This
may
be inconsistent. Maybe better to use control+nvda+shift for
navigator
object and nvda+alt for clipboard?

Cheers


El 27/03/2019 a las 20:39, Noelia Ruiz via Groups.Io escribió:
Hi again, the add-on is on the website. Anyway, I recommend you to
format a bit the documentation, for example, you may want to put
The
first heading level 3 after level 1 to 2, and also fix lists,
since
after the log option the list finishes, etc.
I have put asterisks at start, so that compatibility info, author
and
download links appear in a list, as done with other add-ons on the
website.
Also, the link using addonFiles address is used, to avoid issues
in
translations.
When you declare the add-on as stable, we can register it to be
translated. Now translators can start with documentation.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

https://addons.nvda-project.org/addons/onlineOCR.en.html
El 27/03/2019 a las 19:29, Noelia Ruiz via Groups.Io escribió:
Hi, basic review results after adding image describers:

- License and copyright: Pass.
- Security: pass.
- Documentation: Pass.
- User experience: pass with comments.

Image describers can provide text results (not in browse mode),
and
this has to be fixed. Also, when pressing the gesture twice, NVDA
can
report that there is active another recognizer.
Anyway, seing that author is listening and responding feed-back,
and
that Oliver work has been used and mentioned in documentation,
and
moreover the code shows a deep experience looking at NVDA
relevant
code, and also that the main structure is done, and fail refers
to
part of some features, and that this should be tested in
different
languages, I think we can post it in the development section of
website to share widely in different communities, since the
add-on is
useful now and will be improved.
Here is my log. Since no objections are expressed, I will post
the
add-on now, and this is the log for emage describer errors:

INFO - external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard (19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}
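
"object is not iterable" is the message Python raises when code loops over an object that does not support iteration. The following is only a minimal sketch, not the add-on's actual code, of reading the caption text out of a response shaped like the dictionary logged above. The helper name `extract_captions` is hypothetical, and the dictionary layout is taken from the log; everything else is an assumption.

```python
def extract_captions(response):
    """Return caption texts from a response dict, best confidence first.

    Hypothetical helper: only the dictionary layout comes from the log,
    the function itself is an illustration.
    """
    captions = response.get("description", {}).get("captions", [])
    ordered = sorted(captions, key=lambda c: c.get("confidence", 0.0), reverse=True)
    return [c["text"] for c in ordered if c.get("text")]


# Response copied from the log (unicode prefixes dropped).
logged_response = {
    "metadata": {"width": 272, "format": "Png", "height": 201},
    "description": {
        "captions": [
            {"text": "a drawing of a face", "confidence": 0.3644558611090458}
        ],
        "tags": ["drawing", "food"],
    },
    "requestId": "e17fb346-8b0c-434e-b14f-66dabd794424",
}

# Always a list of strings, so the caller can safely iterate over it.
print("\n".join(extract_captions(logged_response)))
```

Returning a plain list, even an empty one, means the calling code can loop over the result without checking its type first.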



On 27/03/2019 at 18:36, Noelia Ruiz via Groups.Io wrote:
Yes, I have seen an error with the text result in browse mode with the
image describer plugin. Anyway, I think it has had a lot of feedback
and could simply be added to the development section of the website,
since the major part is done, it has passed the basic review, the
quality in general is good, and no major changes in documentation are
going to happen for now. If you don't have objections, I will post
this in about an hour for wide testing. If minor versions are released
they will be posted too, but translators can already start translating
the documentation so that different communities can test the add-on
with some quality and give feedback, as a first filter.
Cheers

Sent from my iPhone

On 27 Mar 2019, at 17:43, Robert Hänggi <aarjay.robert@gmail.com> wrote:

Hi
I'm glad that the add-on is making such huge steps.

The browsable message doesn't seem to work for NVDA+Alt+P if I swap
the gestures.

The description might need some formatting, at least some line breaks.

That's the result with Azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains racy content
Dominant foreground color is Red. Dominant background color is White.
Hex code of accent color is 000000. Dominant colors: White The image is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of
Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or expressed
with the NVDA colour descriptions.
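
The two formatting suggestions above (omit empty entries, show the accent colour as RGB) could look roughly like this. It is a sketch only: `hex_to_rgb` and `format_analysis` are hypothetical helpers, the field labels are illustrative, and only the sample values come from the Azure result quoted above.

```python
def hex_to_rgb(hex_code):
    """Convert a 6-digit hex colour such as '000000' to an (r, g, b) tuple."""
    hex_code = hex_code.lstrip("#")
    if len(hex_code) != 6:
        raise ValueError("expected a 6-digit hex colour, got %r" % hex_code)
    return tuple(int(hex_code[i:i + 2], 16) for i in (0, 2, 4))


def format_analysis(fields):
    """Join (label, value) pairs one per line, skipping empty values."""
    lines = []
    for label, value in fields:
        if isinstance(value, (list, tuple)):
            value = ", ".join(value)
        if value:
            lines.append("%s: %s" % (label, value))
    return "\n".join(lines)


# Sample values from the quoted result; labels are illustrative only.
fields = [
    ("Categories", ["others_", "outdoor_", "text_sign"]),
    ("Accent colour (RGB)", "%d, %d, %d" % hex_to_rgb("000000")),
    ("Tags", ["screenshot", "design", "pixel", "vector", "typography"]),
    ("Brands", []),  # empty, so it is left out of the output
    ("Descriptions", ["a close up of a device"]),
]
print(format_analysis(fields))
```

Putting each feature on its own line also gives the line breaks asked for, which matters when the result is read in a browsable document.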

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this wonderful
add-on on the website after work, in the evening, in about 2 hours or
so.
Personally, I feel deeply moved by this. Of course, this doesn't
replace a human description, where we can ask questions and so on.
I think this also opens a door for students in scientific degrees,
where people can have a lot of problems and some of us have heard
things like: "Why don't you try to drive a car?". Some things can be
identified by machine learning, or at least this can help us, and this
is very important.
I have a lot of work and hope other reviewers can help with new
add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of the online OCR addon is released.
Changes in this version include:
Change addon summary to online image describer
Added image description capability

NVDA+Alt+P: Recognizes the current navigator object, then reads the
result. If pressed twice, opens a virtual result document.

Control+Shift+NVDA+P: Recognizes the image in the clipboard, then reads
the result. If pressed twice, opens a virtual result document.

There are three engines available.

### Machine Learning Engine by Oliver Edholm
It's a free engine that gives a description of an image.
If there is text inside, it will do OCR on the image.
There are two settings for this engine.
* Language of result:
English by default. If you configure a language other than English,
the description could have translation issues because it's
automatically generated by a machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud Platform
for analysis. After the analysis the image gets removed from the
server and will never be seen again.

The author of this addon has set up a proxy on www.nvdacn.com for users
who cannot access Google services.
If you want to use this proxy, please choose "Use proxy on
www.nvdacn.com" in the access type settings.

If you want to use your own key in the following two Microsoft
engines, please follow the guide in the Microsoft Azure OCR section.

### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on the image
content.
This engine is English only for now.

Visual Features include:
Adult - detects if the image is pornographic in nature (depicts nudity
or a sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including the
approximate location. The Brands argument is only available in English.
Categories - categorizes image content according to a taxonomy defined
in documentation.
Color - determines the accent color, dominant color, and whether an
image is black&white.
Description - describes the image content with a complete sentence in
supported languages.
Faces - detects if faces are present. If present, generates
coordinates, gender and age.
ImageType - detects if image is clip art or a line drawing.
Objects - detects various objects within an image, including the
approximate location. The Objects argument is only available in
English.
Tags - tags the image with a detailed list of words related to the
image content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human readable
language with complete sentences. The description is based on a
collection of content tags, which are also returned by the operation.
More than one description can be generated for each image.
Descriptions are ordered by their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description of the
image. English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The default
is 1.


Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon






Cheers,
Larry



On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz <nrm1977@gmail.com> wrote:

In case it's useful, more suggestions:
- The description presented in input help mode for
NVDA+shift+control+r announces that it recognizes the content and
opens a virtual result, but this is configurable. This could be
mentioned, indicating that the result can be spoken and brailled or
presented in a virtual document.
- Regarding the code, I have seen that sometimes something like this
is used:
if something: return
else...
But I think in some cases the else after return is not needed, since
if the previous condition holds, the script returns and nothing more
can happen. I made this mistake years ago and Mesar Hameed corrected
me. He is a great person and developer who initially created the
scripts used for the translation system, which make add-ons
translatable by NVDA's translators by exchanging messages between the
add-ons stored on the Bitbucket team account and the translators'
repository, and who was also the first author of the guidelines for
reviewers and authors, in case you are interested.
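
To illustrate the point about else after return, here is a small sketch with made-up function names (they are not taken from the add-on). Both functions behave the same; the second simply drops the redundant branch.

```python
def describe_with_else(result):
    # Style mentioned above: an else branch after a return.
    if not result:
        return "No result"
    else:
        return "Result: " + result


def describe_early_return(result):
    # Same behaviour: once the first branch returns, nothing below it
    # runs for that case, so the else is redundant.
    if not result:
        return "No result"
    return "Result: " + result


for value in ("", "a close up of a device"):
    print(describe_early_return(value))
```

The early-return form keeps the main path at the lowest indentation level, which tends to be easier to follow in long scripts.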
Whenever you want, for example when image descriptions are added, we
can post the add-on on the website, and if it is later joined with
ImageDescriber, we will fix things then. I say this because joining
these add-ons may be complicated, imo, at least currently, since the
two interfaces are very different. Anyway, if you want to wait to post
the add-on, this is OK.


Cheers

On 22/03/2019 at 11:49, Noelia Ruiz via Groups.Io wrote:
Sorry, another thought: I think that Larry could accept a pull request
if the features contained in ImageDescriber are properly integrated in
online_ocr. For example, I don't know if it's a good idea just to join
the two add-ons, and I would understand if for any reason this could
not be done and they remain as independent add-ons. Please don't feel
any pressure on our part (meaning Robert and me).
Just to clarify.

2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:
Hi, this is great news. I think that Larry may accept pull requests,
since the add-on GUI seems to be flexible: it accepts profiles, is
integrated in NVDA settings, accepts content recog messages, etc.
Also, I think that creating a pull request for this add-on can be
easy, using the abstract class for plugins. So, imo, if you both
agree, I would vote for adding just one joined add-on on the website.
ImageDescriber could be a better name since it's more generic, though
I would definitely use Larry's UI and framework. What do you think?

Cheers and thanks both for your work.

2019-03-22 11:25 GMT+01:00, Robert Hänggi <aarjay.robert@gmail.com>:
Oliver, I'm very glad that you're open to collaboration.
I think your joint efforts would lead to a great add-on.
Apart from that, it may help to avoid clones of 3rd party modules
present in both add-ons, thus reducing storage and load time.

I could perhaps also help with mouse related stuff (e.g. recognition
of unknown pointer shapes automatically).

Cheers
Robert

On 22/03/2019, Oliver Edholm <oliver.edholm@gmail.com> wrote:
Hi! It's the creator of the Image Describer add-on. I looked a little
bit at the code and it looks very well made.

I agree that we do similar things here. Yesterday I added OCR to the
Image Describer backend, for example (without knowing about this
add-on).

I'm also working on a free Captcha Solver, which seems to be something
Larry has also looked into.

Applying more AI methods to these problems is also something I've been
working on. You gave the example of icon classification; I trained a
classifier for this ~2 weeks ago, which I intend to eventually add to
the Image Describer add-on.

I'm definitely open to some sort of collaboration here.

On Fri, Mar 22, 2019 at 01:39 AM, Robert Hänggi wrote:

Wouldn't AI conflict with the objectDescriber add-on?
I mean, two add-ons with essentially the same functionality is a bit
strange.

However, I certainly see positive results when AI is integrated.
Imagine an inaccessible application. A lot of the time, it uses not
only text but also symbols, and it would be great if general ones
could be recognized:
- disc icons
- cog wheel/wrench icons
- transport controls: play/fast forward/rewind icons
- home icon
- triangles (for drop-down menus)
- check boxes, radio buttons, folder icons (= browse for folder) and
so on.

Additionally, simple rectangles which could indicate form fields and
edit boxes. Compare the auto form detection used in Acrobat Pro DC.

(By the way, in my version of golden cursor I make use of changing
mouse icons to determine e.g. the boundaries of an edit box, scroll
bars or sliders.)

Another thing that I would like to see:
If one presses enter in a recognition result, the default action is
performed. So far so good, but here are some possible improvements:
Case 1, when the recognition result stays open.
- perform a new recognition and update the document to reflect
possible changes.
Case 2, when the navigator object changes and the document closes.
- if the new navigator object is not a known class (e.g. window or
system client or generic, without children) do as for case 1.

Best
Robert



On 21/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:

Control+NVDA+Shift+I is assigned to the readFeeds add-on, but feel free
to use it, since readFeeds is an add-on maintained and initially
created by me (smile).

Sent from my iPhone


On 21 Mar 2019, at 8:17, Larry Wang <larry.wang.801@gmail.com> wrote:

AI recognition is not hard to add since it performs a similar
operation on an image, and I have found several engines too. But we
need another two gestures.

Are NVDA+Ctrl+Shift+I and NVDA+Ctrl+Alt+Shift+I unassigned?


On 2019/3/21 12:53, Noelia Ruiz wrote:
OK, let's wait some hours, and tonight or tomorrow I will try to post
the add-on on the website if all is OK. For now I will upload the dev
version with one link, and when you declare the add-on as stable we
will post the link to the stable version too and will register the
add-on in the translation system.
I would like it if you could add AI recognition of images, for
instance whether they contain a man, a room, blond or brown hair, etc.
This is wonderful work!

On 21/03/2019 at 1:47, Larry Wang wrote:
Hi everyone,

I am happy to release version 0.10 of the online OCR addon.
Changes in this version include:

Fix error using user's own api key in sougou API

Fix unknown panel in sougou API settings

Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.10-dev/onlineOCR-0.10-dev.nvda-addon





Cheers,
Larry








Re: Online OCR addon #addonrequestreview

Noelia Ruiz
 

Also, the swap setting seems not to work for image describers, just for OCR.
This is a bug that appears when recognizing the Google logo:

ERROR - stderr (13:52:14.867):
Exception in thread Thread-8:
Traceback (most recent call last):
File "threading.pyo", line 801, in __bootstrap_inner
File "threading.pyo", line 754, in run
File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\winHttp.py", line 115, in postContent
File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\winHttp.py", line 109, in doHTTPRequest
File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\onlineOCRHandler.py", line 514, in callback
File "ui.pyo", line 67, in message
File "braille.pyo", line 1795, in message
File "braille.pyo", line 1807, in _resetMessageTimer
File "wx\core.pyo", line 3305, in Start
wxAssertionError: C++ assertion "wxThread::IsMain()" failed at ..\..\src\common\timerimpl.cpp(60) in wxTimerImpl::Start(): timer can only be started from the main thread
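
The traceback shows `callback` reaching `ui.message` from a worker thread (Thread-8), while the wx timer behind the braille message may only be started from the main thread. A common pattern is to hand the call over to the main thread instead of making it directly; in wxPython this is usually done with `wx.CallAfter`, though whether that is the right fix for this add-on is an assumption. The sketch below shows the idea with the standard library only (Python 3); all names in it are hypothetical.

```python
import queue
import threading

# Calls that must run on the main thread are queued here.
main_thread_calls = queue.Queue()


def report(text, log):
    # Stand-in for a UI call that is only safe on the main thread.
    log.append((text, threading.current_thread().name))


def worker(log):
    # Background thread, e.g. an HTTP callback: queue the UI call
    # instead of invoking it from here.
    main_thread_calls.put((report, ("recognition finished", log)))


log = []
thread = threading.Thread(target=worker, args=(log,), name="http-worker")
thread.start()
thread.join()

# Back on the calling thread: run whatever the worker queued.
while not main_thread_calls.empty():
    func, args = main_thread_calls.get()
    func(*args)

print(log)
```

The recorded thread name shows that `report` ran on the thread that drained the queue, not on the worker.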


2019-03-28 13:35 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:

Hi Larry, I have given you write access to the addonFiles repo. Please
clone it and look at the get.php file and the entry for your add-on,
identified as oid.
There you can replace the redirect link with the updated URL.
After work I will update the documentation on the website if you can't.
Clone it with:
git clone https://bitbucket.org/nvdaaddonteam/addonfiles.git
Cheers


2019-03-28 11:20 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia,

If Robert's issue is fixed, I think it is better to post 0.13.

I have a Bitbucket account and I am willing to post versions myself.

Actually this addon was in a private repo on Bitbucket before I put it
on GitHub.

This is my profile page.

https://bitbucket.org/Rheinmetal/

On 2019/3/28 17:37, Noelia Ruiz wrote:
Hi, let us know if we should post 0.12, or 0.13 with the profile issue
fixed.
Also, for add-ons reviewed by me, if authors show Git experience, I
try to offer write access to the addonFiles repo if they have a
Bitbucket account, so authors can update versions themselves on the
website, if they want. Otherwise, we can do it.
So, if you have a Bitbucket account and want write access to
addonFiles, let us know. Just be patient, since Bitbucket is not very
comfortable to use and granting this access can be hard; at least in
my case, I sometimes need to retry several times.

Cheers


2019-03-28 9:51 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia

1. The documentation problem mentioned above is corrected in 0.12.

2. Access type is valid for Oliver's engine. That is because I provided
a proxy for users without access to Google; by default it connects
directly to Google.
3. I will look into how to use a checkbox. The checkbox in the advanced
panel may be a good candidate.


On 2019/3/28 15:10, Noelia Ruiz wrote:
Also, the plugins based on Oliver's work seem not to need an API key.
So I think that the combo box should be disabled, since the combo box
label refers to the API key, but the options correspond to the proxy.
About the details supported by plugins like Azure Analyser, I think
that a list of checkboxes may be better than a combo box.
Also, I saw that in some part imageInfo appears as not defined. This
is normal, since the add-on is very complex and has a lot of plugins.
Anyway, we can update the version and the formatted documentation easily.

Cheers


On 27/03/2019 at 23:05, Noelia Ruiz via Groups.Io wrote:
Hi, your suggestion sounds good. I don't use the numeric keypad, and I
use caps lock as the NVDA key on my laptop, so I don't have problems
with combinations, but I agree with you.

Cheers


On 27/03/2019 at 22:16, Robert Hänggi wrote:
Hi Noelia,
I would swap the two gestures: NVDA+Alt for the normal functionality
and NVDA+Shift+Control for the clipboard, because pressing three keys
is hard enough, no need for 4. Especially if you want to perform a
double gesture.

I wonder, shouldn't the NumPad be used, if available? After all, the
gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
(I've seen the latter one in the dragAndDrop add-on, but I don't know
if that is still maintained).

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+control+shift is used with r for the navigator object, and with p
for clipboard images. And alt+NVDA is used for the navigator object in
the case of the image describer, and with r for the clipboard. This
may be inconsistent. Maybe it would be better to use
control+NVDA+shift for the navigator object and NVDA+alt for the
clipboard?

Cheers


On 27/03/2019 at 20:39, Noelia Ruiz via Groups.Io wrote:
Hi again, the add-on is on the website. Anyway, I recommend that you
format the documentation a bit; for example, you may want to change
the first level 3 heading after level 1 to level 2, and also fix the
lists, since the list ends after the log option, etc.
I have put asterisks at the start, so that compatibility info, author
and download links appear in a list, as done with other add-ons on the
website.
Also, the link using the addonFiles address is used, to avoid issues
in translations.
When you declare the add-on as stable, we can register it to be
translated. Translators can already start with the documentation.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

On 27/03/2019 at 19:29, Noelia Ruiz via Groups.Io wrote:
Hi, basic review results after adding image describers:

- License and copyright: Pass.
- Security: pass.
- Documentation: Pass.
- User experience: pass with comments.

Image describers can provide text results (not in browse mode),
and
this has to be fixed. Also, when pressing the gesture twice, NVDA
can
report that there is active another recognizer.
Anyway, seing that author is listening and responding feed-back,
and
that Oliver work has been used and mentioned in documentation,
and
moreover the code shows a deep experience looking at NVDA
relevant
code, and also that the main structure is done, and fail refers
to
part of some features, and that this should be tested in
different
languages, I think we can post it in the development section of
website to share widely in different communities, since the
add-on is
useful now and will be improved.
Here is my log. Since no objections are expressed, I will post
the
add-on now, and this is the log for emage describer errors:

INFO -
external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard
(19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing',
u'food']},
u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing',
u'food']},
u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing',
u'food']},
u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}



El 27/03/2019 a las 18:36, Noelia Ruiz via Groups.Io escribió:
Yes, I have seen an error with text result in browse mode with
the
image describer plugin. Anyway, I think it has a lot of feedback
and
just it could be added to development section of website, since
the
major part is done, it has passed basic review, the quality in
general is good, and no major changes in documentation are going
to
happen for now. If you dont have objections, I will post this in
about an hour for wide testing, and if minor version are
released
they will be posted, but now translators can start to translate
documentation so that different communities can test the add-on
with
some quality and feedback, as a first filter.
Cheers""

Enviado desde mi iPhone

El 27 mar 2019, a las 17:43, Robert Hänggi
<aarjay.robert@gmail.com>
escribió:

Hi
I'm glad that the addOn makes such huge steps.

The browsable message doesn't seem to work for NVDA+Alt+P if I
swap
the gestures.

The description might need some formatting, at least some line
breaks.

That's the result with azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains
racy
content
Dominant foreground color is Red. Dominant background color is
White.
Hex code of accent color is 000000. Dominant colors: White The
image
is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of
Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or
expressed
with the NVDA colour descriptions.

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this
wonderful
add-on on the website after job, in the evening, in about 2
hours or
so.
Personally, I feel a deep emotion for this. Of course, this
doesn't
replace a human description where we can ask questions and so
on.
I think this is an open door also for students in scientific
degrees,
where people can have a lot of problems and some of us have
listened
things like: "Why don't you try to drive a car?". Since some
things
can be identified by machine learning, or at least this can
help us,
and this is very important.
I have a lot of job and hope other reviewers can help with new
add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang
<larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of online ocr addon is released.
Changes in this version include:
Change addon summary to online image describer
Added image description capability

NVDA+Alt+P Recognize current navigator object Then read
result. If
pressed
twice, open a virtual result document.

Control+Shift+NVDA+P Recognizes image in clipboard . Then
read
result. If
pressed twice, open a virtual result document.
Here are three engines available.

### Machine Learning Engine by Oliver Edholm
It's a free engine gives description of an image.
If there is text inside it will do OCR on the image.
There are two settings for this engine.
* Language of result:
English by default. If you configure another language than
English, the
description could have translation issues because it's
automatically
generated by machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud
Platform for
analysis. After the analysis the image gets removed from the
server and
will never be seen again.

The author of this addon has setup a proxy on www.nvdacn.com
for
users
who
cannot access google service access.
If you want to use this proxy please chose Use proxy on
www.nvdacn.com in
access type settings.

If you want to use your own key in the following two
Microsoft
engines.
Please follow the guide in Microsoft Azure OCR section.
### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on
the
image
content.
This engine is english only by now.

Visual Features include:
Adult - detects if the image is pornographic in nature
(depicts
nudity or
a
sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including
the
approximate
location. The Brands argument is only available in English.
Categories - categorizes image content according to a
taxonomy
defined in
documentation.
Color - determines the accent color, dominant color, and
whether
an image
is black&white.
Description - describes the image content with a complete
sentence
in
supported languages.
Faces - detects if faces are present. If present, generate
coordinates,
gender and age.
ImageType - detects if image is clip art or a line drawing.
Objects - detects various objects within an image, including
the
approximate location. The Objects argument is only available
in
English.
Tags - tags the image with a detailed list of words related
to the
image
content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the
image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human
readable
language
with complete sentences. The description is based on a
collection of
content tags, which are also returned by the operation. More
than
one
description can be generated for each image. Descriptions are
ordered by
their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description
of the
image.
English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The
default is
1.


Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon






Cheers,
Larry



On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz
<nrm1977@gmail.com>
wrote:

In case it's useful, more suggestion:
- In the documentation presented in input mode for
NVDA+shift+control+r
announces that recognizes the content and opens a virtual
result,
but
this is configurable. This may be mentioned indicating that
the
result
could be spoken and braillified or presented in a virtual
document.
- Regarding the code, I have seen that sometimes it's used
something
like:
if something: return
else...
But I think in some cases else after return is not needed,
since
if the
previous condition exists, the script return and no more can
happen. I
made this mistake years ago and Mesar Hameed, a great person
and
developer, who created initially the scripts used for the
translation
system and make add-ons translatable by NVDA's translators
exchanging
messages between add-ons stored on Bitbucket team account
and the
translators repository, also first creator of guidelines for
reviewers
and authors and so on, fixed me teaching this mistake made
by me,
in
case you are interested.
When you want, for example when image descriptions are
added, we
can
post the add-on on the website, and if this is joined with
ImageDescriber, we will fix things later. I say this since
maybe
complicated to join these add-ons, imo, at least currently,
since
the
two interfaces are very different. Anyway, if you want to
wait to
post
the add-on, this is OK.


Cheers

El 22/03/2019 a las 11:49, Noelia Ruiz via Groups.Io
escribió:
Sorry, another thought: I think that Larry could accept
pull
request
if the features contained in ImageDescribed are properly
integrated in
online_ocr. For example, I don't know a good idea just to
join
the two
add-ons, and I would understand that for any reason this
could
be done
and they remain as independent add-ons. Please don't feel
any
pressure
for our part (mentioning Robert and me).
Just for clarify

2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:
Hi, this is great news. I think that Larry may accept pull
requests,
since the add-on gui seems to be flexible, accepting
profiles,
integrated in NVDA settings, and accepting content recog
messages,
etc. Also, I think that creating pull request in this
add-on
can be
easy, using the abstract class for plugins. So, imo, if
both you
agree, I would vote for adding just one joined add-on on
the
website.
ImageDescriber could be a better name since it's more
generic,
though
I definitely would use Larry's UI and framework. What do
you
think?

Cheers and thanks both for your work.

2019-03-22 11:25 GMT+01:00, Robert Hänggi
<aarjay.robert@gmail.com>:
Oliver, I'm very glad that you're open to collaboration.
I think you're joined efforts would lead to a great
add-on.
Apart from that, it may help to avoid clones of 3rd party
modules
present in both add-ons and thus reducing storage and
load time.

I could perhaps also help with mouse related stuff (e.g.
recognition
of unknown pointer shapes automatically).

Cheers
Robert

On 22/03/2019, Oliver Edholm <oliver.edholm@gmail.com>
wrote:
Hi! It's the creator of the Image Describer add-on.
Looked a
little
bit
at
the code and it looks very well made.

I agree that we do similar things here. Yesterday I
added OCR
to
the
Image
Describer backend for example (without knowing about
this
add-on).

I'm also working on a free Captcha Solver which seems to
be
something
Larry
has also looked into.

Also regarding the example of utilizing more methods
from AI to
these
problems it's also something I've been working on. You
gave the
example
of
icon classification, I trained a classifier for this ~2
weeks
ago
which
intend to eventually add to the Image Describer add-on.

I'm definitely open for some sort of collaboration here.

On Fri, Mar 22, 2019 at 01:39 AM, Robert Hänggi wrote:


Wouldn't AI conflict with the objectDescriber add-on?
I mean, two add-ons with essentially the same functionality is a bit
strange.

However, I certainly see positive results when AI is integrated.
Imagine an inaccessible application. A lot of the time, it uses not
only text but also symbols, and it would be great if general ones
could be recognized:
- disc icons
- cog wheel/wrench icons
- transport controls: play/fast forward/rewind icons
- home icon
- triangles (for drop-down menus)
- check boxes, radio buttons, folder icons (= browse for folder)
and so on.

Additionally, simple rectangles, which could indicate form fields and
edit boxes. Compare the auto form detection used in Acrobat Pro DC.

(By the way, in my version of golden cursor I make use of changing
mouse icons to determine e.g. the boundaries of an edit box, scroll
bars or sliders.)

Another thing that I would like to see:
If one presses Enter in a recognition result, the default action is
performed. So far so good, but here are some possible improvements:
Case 1, when the recognition result stays open:
- perform a new recognition and update the document to reflect
possible changes.
Case 2, when the navigator object changes and the document closes:
- if the new navigator object is not a known class (e.g. window or
system client or generic, without children), do as for case 1.

Best
Robert



On 21/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:

Control+NVDA+Shift+I is assigned to the readfeeds add-on, but feel
free to assign it, since readfeeds is an add-on maintained and
initially created by me (smile).

Sent from my iPhone


On 21 Mar 2019, at 8:17, Larry Wang <larry.wang.801@gmail.com> wrote:

AI recognition is not hard to add, since it performs a similar
operation on an image; I have found several engines too. But we need
another two gestures.

Are NVDA+Ctrl+Shift+I and NVDA+Ctrl+Alt+Shift+I unassigned?


On 2019/3/21 12:53, Noelia Ruiz wrote:
OK, let's wait some hours, and tonight or tomorrow I will try to post
the add-on on the website if all is OK. For now I will upload the dev
version with one link, and when you declare the add-on as stable we
will post the link to the stable version too and will register the
add-on in the translation system.
I would like it if you could add AI recognition of images, for
instance whether they contain a man, a room, blond or brown hair, etc.
This is wonderful work!

On 21/03/2019 at 1:47, Larry Wang wrote:
Hi everyone,

I am happy to release version 0.10 of the online OCR add-on.
Changes in this version include:

- Fix error when using the user's own API key in the sougou API
- Fix unknown panel in sougou API settings

Here is the direct download link:

https://github.com/larry801/online_ocr/releases/download/0.10-dev/onlineOCR-0.10-dev.nvda-addon

Cheers,
Larry

Re: Online OCR addon #addonrequestreview

Noelia Ruiz
 

Hi Larry, I have given you write access to the addonFiles repo. Please
clone it, look at the get.php file and find the entry for your add-on,
identified as oid.
There you can replace the redirect link with the updated URL.
After work I will update the documentation on the website if you can't.
Clone with:
git clone https://bitbucket.org/nvdaaddonteam/addonfiles.git
Cheers


2019-03-28 11:20 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:

Hi Noelia,

If Robert's issue is fixed, I think it is better to post 0.13.

I have a Bitbucket account and I am willing to post versions myself.

Actually this addon was on a private repo on bitbucket before I put it
on github.

This is my profile page.

https://bitbucket.org/Rheinmetal/

On 2019/3/28 17:37, Noelia Ruiz wrote:
Hi, let us know if we should post 0.12, or 0.13 with the profile issue
fixed.
Also, for add-ons reviewed by me, if authors show Git experience and
have a Bitbucket account, I try to offer write access to the
addonFiles repo, so authors can update versions on the website
themselves if they want. Otherwise, we can do it.
So, if you have a Bitbucket account and want write access to
addonFiles, let us know. Just be patient, since Bitbucket is not very
convenient and granting this access can be hard; at least in my case,
I sometimes need to retry several times.

Cheers


2019-03-28 9:51 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia

1. The documentation problem mentioned above is corrected in 0.12.

2. The access type is valid for Oliver's engine. That is because I
provide a proxy for users without access to Google; by default it
connects directly to Google.
3. I will look into how to use a check box. The check box in the
advanced panel may be a good candidate.


On 2019/3/28 15:10, Noelia Ruiz wrote:
Also, the plugins based on Oliver's work seem not to need an API key.
So I think that the combo box should be disabled, since the combo box
label refers to the API key, but the options correspond to the proxy.
About the details supported by plugins like Azure Analyser, I think
that a list of check boxes may be better than a combo box.
Also, I saw that in some places imageInfo appears as not defined. This
is normal, since the add-on is very complex and has a lot of plugins.
Anyway, we can update the version and the formatted documentation
easily.

Cheers


On 27/03/2019 at 23:05, Noelia Ruiz via Groups.Io wrote:
Hi, your suggestion sounds good. I don't use the numeric keypad, and I
use the laptop layout with Caps Lock as the NVDA key, so I don't have
problems with combinations, but I agree with you.

Cheers


On 27/03/2019 at 22:16, Robert Hänggi wrote:
Hi Noelia,
I would swap the two gestures: NVDA+Alt for the normal functionality
and NVDA+Shift+Control for the clipboard, because pressing three keys
is hard enough, no need for 4. Especially if you want to perform a
double gesture.

I wonder, shouldn't the NumPad be used, if available? After all, the
gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
(I've seen the latter one in the dragAndDrop addOn but I don't know if
that is still maintained).

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+Control+Shift is used with R for the navigator object and with P
for clipboard images, while NVDA+Alt is used for the navigator object
in the case of the image describer, and with R for the clipboard. This
may be inconsistent. Maybe it would be better to use
Control+NVDA+Shift for the navigator object and NVDA+Alt for the
clipboard?

Cheers


On 27/03/2019 at 20:39, Noelia Ruiz via Groups.Io wrote:
Hi again, the add-on is on the website. Anyway, I recommend that you
tidy up the formatting of the documentation a bit: for example, you
may want to change the first level 3 heading, which comes right after
level 1, to level 2, and also fix the lists, since the list ends after
the log option, etc.
I have put asterisks at the start of lines, so that compatibility
info, author and download links appear in a list, as done with other
add-ons on the website.
Also, the link using the addonFiles address is used, to avoid issues
in translations.
When you declare the add-on as stable, we can register it to be
translated. For now, translators can start with the documentation.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

On 27/03/2019 at 19:29, Noelia Ruiz via Groups.Io wrote:
Hi, basic review results after adding image describers:

- License and copyright: pass.
- Security: pass.
- Documentation: pass.
- User experience: pass with comments.

Image describers can provide text results only (not in browse mode),
and this has to be fixed. Also, when pressing the gesture twice, NVDA
may report that another recognizer is active.
Anyway, seeing that the author is listening and responding to
feedback, that Oliver's work has been used and credited in the
documentation, that the code shows deep experience with the relevant
NVDA code, that the main structure is done and the failure affects
only part of some features, and that this should be tested in
different languages, I think we can post it in the development section
of the website to share it widely in different communities, since the
add-on is useful now and will be improved.
Since no objections have been expressed, I will post the add-on now.
Here is my log of the image describer errors:

INFO - external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard (19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}



On 27/03/2019 at 18:36, Noelia Ruiz via Groups.Io wrote:
Yes, I have seen an error with the text result in browse mode with the
image describer plugin. Anyway, I think it has had a lot of feedback
and could simply be added to the development section of the website,
since the major part is done, it has passed basic review, the quality
in general is good, and no major changes in documentation are going to
happen for now. If you don't have objections, I will post this in
about an hour for wide testing, and if minor versions are released
they will be posted too. Meanwhile, translators can start to translate
the documentation so that different communities can test the add-on
with some quality and feedback, as a first filter.
Cheers

Sent from my iPhone

On 27 Mar 2019, at 17:43, Robert Hänggi <aarjay.robert@gmail.com> wrote:

Hi
I'm glad that the add-on is making such huge steps.

The browsable message doesn't seem to work for NVDA+Alt+P if I swap
the gestures.

The description might need some formatting, at least some line breaks.

That's the result with Azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains racy
content
Dominant foreground color is Red. Dominant background color is White.
Hex code of accent color is 000000. Dominant colors: White The image
is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of
Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or expressed
with the NVDA colour descriptions.
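
The two formatting suggestions above could look roughly like the sketch below. It is plain Python, not code from the add-on: the names hex_to_rgb and format_description are hypothetical, and mapping to NVDA's own colour descriptions is left out.

```python
def hex_to_rgb(hex_code):
    """Convert a six-digit hex colour such as '000000' or '#FF8000'
    into an (r, g, b) tuple of integers."""
    digits = hex_code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hex digits, got %r" % hex_code)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def format_description(fields):
    """Build one line per (label, values) pair, skipping empty values.

    `fields` is a list of (label, list_of_strings) pairs; this layout is
    an assumption made for the example, not the add-on's data model.
    """
    lines = []
    for label, values in fields:
        if values:  # omit empty tags instead of printing a bare label
            lines.append("%s: %s" % (label, ", ".join(values)))
    return "\n".join(lines)


print(hex_to_rgb("000000"))
print(format_description([
    ("Tags", ["screenshot", "design"]),
    ("Brands", []),
    ("Descriptions", ["a close up of a device"]),
]))
```

Putting each field on its own line would also give the line breaks asked for above.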

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this wonderful
add-on on the website after work, in the evening, in about 2 hours or
so.
Personally, I feel deeply moved by this. Of course, this doesn't
replace a human description, where we can ask questions and so on.
I think this also opens a door for students in scientific degrees,
where people can have a lot of problems, and some of us have heard
things like: "Why don't you try to drive a car?". Some things can be
identified by machine learning, or at least it can help us, and this
is very important.
I have a lot of work and hope other reviewers can help with new
add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of the online OCR add-on is released.
Changes in this version include:
- Change add-on summary to online image describer
- Added image description capability

NVDA+Alt+P: Recognizes the current navigator object, then reads the
result. If pressed twice, opens a virtual result document.

Control+Shift+NVDA+P: Recognizes the image in the clipboard, then reads
the result. If pressed twice, opens a virtual result document.
There are three engines available.

### Machine Learning Engine by Oliver Edholm
It's a free engine that gives a description of an image.
If there is text inside, it will do OCR on the image.
There are two settings for this engine.
* Language of result:
English by default. If you configure a language other than English,
the description could have translation issues because it's
automatically generated by a machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud Platform
for analysis. After the analysis the image is removed from the server
and will never be seen again.

The author of this add-on has set up a proxy on www.nvdacn.com for
users who cannot access Google services.
If you want to use this proxy, please choose "Use proxy on
www.nvdacn.com" in the access type settings.

If you want to use your own key in the following two Microsoft
engines, please follow the guide in the Microsoft Azure OCR section.
### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on the image
content.
This engine is English-only for now.

Visual features include:
Adult - detects if the image is pornographic in nature (depicts nudity
or a sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including the
approximate location. The Brands argument is only available in English.
Categories - categorizes image content according to a taxonomy defined
in the documentation.
Color - determines the accent color, dominant color, and whether an
image is black and white.
Description - describes the image content with a complete sentence in
supported languages.
Faces - detects if faces are present. If present, generates
coordinates, gender and age.
ImageType - detects if the image is clip art or a line drawing.
Objects - detects various objects within an image, including the
approximate location. The Objects argument is only available in
English.
Tags - tags the image with a detailed list of words related to the
image content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human-readable
language with complete sentences. The description is based on a
collection of content tags, which are also returned by the operation.
More than one description can be generated for each image.
Descriptions are ordered by their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description of the
image. English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The default
is 1.
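
As an illustration of how the Max Candidates setting and the confidence ordering interact, here is a small sketch in plain Python. The response layout (description, captions, text, confidence) is copied from the error log posted in this thread; the function name top_captions is hypothetical and this is not the add-on's actual code.

```python
def top_captions(response, max_candidates=1):
    """Return up to `max_candidates` caption texts, most confident first."""
    captions = response.get("description", {}).get("captions", [])
    # Sort explicitly rather than trusting the order in the response.
    ranked = sorted(captions, key=lambda c: c.get("confidence", 0.0), reverse=True)
    return [c["text"] for c in ranked[:max_candidates]]


# Response shape taken from the log earlier in the thread.
sample = {
    "metadata": {"width": 272, "format": "Png", "height": 201},
    "description": {
        "captions": [
            {"text": "a drawing of a face", "confidence": 0.3644558611090458},
        ],
        "tags": ["drawing", "food"],
    },
}

print(top_captions(sample))
```

With the default of 1 only the best caption is read out; a larger value returns several candidates, still ordered by confidence.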


Here is the direct download link:

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon

Cheers,
Larry

On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz <nrm1977@gmail.com> wrote:

In case it's useful, some more suggestions:
- The input help documentation for NVDA+Shift+Control+R announces
that it recognizes the content and opens a virtual result, but this is
configurable. It could mention that the result can be spoken and
brailled or presented in a virtual document.
- Regarding the code, I have seen that it sometimes uses something
like:
if something: return
else...
But I think in some cases the else after return is not needed, since
if the previous condition is met, the script returns and nothing more
can happen. I made this mistake years ago, and Mesar Hameed corrected
me and taught me about it, in case you are interested. Mesar is a
great person and developer, who initially created the scripts used for
the translation system, which make add-ons translatable by NVDA's
translators by exchanging messages between the add-ons stored on the
Bitbucket team account and the translators' repository, and who was
also the first author of the guidelines for reviewers and authors, and
so on.
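
A tiny illustration of the second point, in plain Python. The function names and messages are made up for the example and do not come from the add-on:

```python
def describe_with_else(image):
    # The pattern in question: an else branch following a return.
    if image is None:
        return "no image"
    else:
        return "image of %d bytes" % len(image)


def describe_early_return(image):
    # Same behaviour: once the first branch returns, nothing below it
    # runs for that case, so the else adds only indentation.
    if image is None:
        return "no image"
    return "image of %d bytes" % len(image)


for value in (None, b"abc"):
    assert describe_with_else(value) == describe_early_return(value)
```

Both versions behave identically; the second keeps the main path at the outer indentation level, which tends to be easier to read in long scripts.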
Whenever you want, for example when image descriptions are added, we
can post the add-on on the website, and if it is later merged with
ImageDescriber, we will fix things then. I say this because it may be
complicated to merge these add-ons, imo, at least currently, since the
two interfaces are very different. Anyway, if you want to wait to post
the add-on, this is OK.


Cheers

On 22/03/2019 at 11:49, Noelia Ruiz via Groups.Io wrote:
Sorry, another thought: I think that Larry could accept pull requests
if the features contained in ImageDescriber are properly integrated in
online_ocr. For example, I don't know whether it's a good idea just to
join the two add-ons, and I would understand if for any reason this
couldn't be done and they remained independent add-ons. Please don't
feel any pressure on our part (meaning Robert and me).
Just to clarify.

2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io <nrm1977=gmail.com@groups.io>:
Hi, this is great news. I think that Larry may accept pull requests,
since the add-on GUI seems to be flexible, accepting profiles,
integrated in NVDA settings, accepting content recognition messages,
etc. Also, I think that creating pull requests for this add-on can be
easy, using the abstract class for plugins. So, imo, if you both
agree, I would vote for adding just one joined add-on to the website.
ImageDescriber could be a better name since it's more generic, though
I definitely would use Larry's UI and framework. What do you think?

Cheers and thanks to you both for your work.

2019-03-22 11:25 GMT+01:00, Robert Hänggi <aarjay.robert@gmail.com>:
Oliver, I'm very glad that you're open to collaboration.
I think your joint efforts would lead to a great add-on.
Apart from that, it may help to avoid duplicate copies of third-party
modules present in both add-ons, thus reducing storage and load time.

I could perhaps also help with mouse-related stuff (e.g. automatic
recognition of unknown pointer shapes).

Cheers
Robert

Re : Re: [nvda-addons] Online OCR addon #addonrequestreview

Cyrille
 

Hello Larry
Regarding point 1, you may catch the exception, inform the user that recognition failed and then re-raise the caught exception.
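
A minimal sketch of that pattern, assuming a callable that performs the recognition and a callable that speaks a message (in NVDA this would presumably be ui.message, as used in the code quoted below). The names recognize_with_feedback, recognize and notify_user are hypothetical:

```python
def recognize_with_feedback(recognize, image, notify_user):
    """Run recognize(image); on failure tell the user, then re-raise."""
    try:
        return recognize(image)
    except Exception:
        # The user hears that something went wrong even when no error
        # sound is played...
        notify_user("Recognition failed")
        # ...and the bare raise keeps the original exception and its
        # traceback, so the log still shows where the failure came from.
        raise


def failing_engine(image):
    # Stand-in for an engine call that goes wrong.
    raise ValueError("unexpected response")


spoken = []
try:
    recognize_with_feedback(failing_engine, b"", spoken.append)
except ValueError:
    pass
print(spoken)
```

This keeps the feedback for users on stable builds without hiding the traceback that helps during development.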
Regards,
Cyrille
----- Original message -----
From: Larry Wang <larry.wang.801@...>
To: nvda-addons@nvda-addons.groups.io
Sent: Thu, 28 Mar 2019 10:31:51 +0100 (CET)
Subject: Re: [nvda-addons] Online OCR addon #addonrequestreview
Hi Robert,
Thanks for your advice.
1. The "There is another recognition ongoing" message may be related to
missing clean-up after a timeout.
2. The except clause in my recognition function is too broad; failing
early really helps a lot with debugging. The reason I caught everything
was that if a user runs this add-on on stable NVDA, there is no error
sound when exceptions occur, which would make people think the add-on is
not responding.
3. The messages were split out for reuse at first. But after refactoring
the engine code I found that your suggestion would make most of the
messages easier to maintain. Only a few messages can be reused, such as
the no-image message in __init__.py.
4. As for gestures, I have fixed the inconsistency you mentioned: NVDA+Alt
for object navigation and Control+NVDA+Shift for the clipboard. Repeating
gestures may not be too hard to do: just hold down Control, Alt and NVDA,
then press P or R twice. Maybe you are trying to press all four keys twice
at once?
I have mapped NVDA+numpad3 to "Moves to the previous object in a flattened
view of the object navigation hierarchy" and NVDA+numpad9 to next, so I did
not think of using the numpad.
5. Version 0.13 fixed the config switch issue.
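
Points 1 and 2 could be sketched roughly as below: a short try body, one except clause per expected failure with its own message, and a finally block that always clears the "ongoing" state. This is plain Python for illustration only; the Recognizer class, the busy flag, the JSON layout with a "text" key, and the message wording are all assumptions, not the add-on's real code.

```python
import json


class Recognizer(object):
    """Hypothetical stand-in for an engine; not the add-on's real class."""

    def __init__(self, notify_user):
        self.notify_user = notify_user
        self.busy = False

    def handle_response(self, raw):
        self.busy = True
        try:
            # Keep the try body brief: only the calls expected to fail.
            data = json.loads(raw)
            text = data["text"]
        except ValueError:
            # Malformed JSON (json.loads raises a ValueError subclass).
            self.notify_user("The server returned an unreadable response")
            return None
        except KeyError:
            self.notify_user("The response contained no recognized text")
            return None
        finally:
            # Runs on success, on handled errors and on unexpected ones,
            # so the recognizer cannot stay stuck in an "ongoing" state.
            self.busy = False
        return text


spoken = []
engine = Recognizer(spoken.append)
print(engine.handle_response('{"text": "hello"}'))
print(engine.handle_response("not json"))
print(spoken, engine.busy)
```

Anything not listed in an except clause still propagates with its traceback, which keeps unexpected errors visible during development.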

On 2019/3/28 15:39, Robert Hänggi wrote:
> Hi
>
> There are plenty of errors going on and the messages aren't that
> helpful, i.e. they are sometimes not related to the exception that
> caused them.
> I wouldn't generalize too much.
> 1. put the messages where they belong, not e.g. "ui.message(failed_message)"
> 2. treat different errors in separate except clauses with the
> appropriate message. Also, keep the try clause brief.
> Example for both:
>
> try:
>     ocrResult = self.extract_text(result)
>     if ocrResult.isspace():
>         # Translators: Reported when recognition result is empty
>         ocrResult = _(u"blank. There may be no text on this image.")
>     resultText = result_prefix + ocrResult
>     if config.conf[self.configSectionName]["copyToClipboard"]:
>         import api
>         api.copyToClip(resultText)
>     if self.text_result:
>         ui.message(resultText)
>     else:
>         self._onResult(LinesWordsResult(self.convert_to_line_result_format(result),
>             imageInfo))
> except Exception as e:
>     log.error(e)
>     log.error(result)
>     ui.message(failed_message)
> finally:
>     self._onResult = None
>
> This particular section from onlineOCRHandler.py causes problems
> because "imageInfo" is not defined in the function:
>
> ERROR - external:globalPlugins.onlineOCR.onlineOCRHandler.BaseRecognizer.callback
> (08:27:15.871):
> global name 'imageInfo' is not defined
> ERROR - external:globalPlugins.onlineOCR.onlineOCRHandler.BaseRecognizer.callback
> (08:27:15.888):
> {u'SearchablePDFURL': u'Searchable PDF not generated as it was not
> requested.', u'OCRExitCode': 1, u'IsErroredOnProcessing': False,
> u'ParsedResults': [{u'FileParseExitCode': 1, u'TextOverlay':
> {u'HasOverlay': True, u'Lines':
> (snip)
>
> 3. Make sure that the add-on works when reloading the plug-ins without
> restart (NVDA+Control+F3)
> I get the following when doing so:
>
> ERROR - extensionPoints.Action.notify (07:59:08.701):
> Error running handler <bound method ?.handlePostConfigProfileSwitch of
> <class 'globalPlugins.onlineOCR.OnlineImageDescriberHandler.OnlineImageDescriberHandler'
> ,>> for <extensionPoints.Action object at 0x02AAE190>
> Traceback (most recent call last):
> File "extensionPoints\__init__.pyc", line 47, in notify
> File "extensionPoints\util.pyc", line 185, in callWithSupportedKwargs
> File "C:\Users\Robert
> Hänggi\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\abstractEngine.py",
> line 175, in handlePostConfigProfileSwitch
> File "C:\Users\Robert
> Hänggi\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\abstractEngine.py",
> line 312, in loadSettings
> File "baseObject.pyc", line 31, in __get__
> File "C:\Users\Robert
> Hänggi\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\imageDescribers\azureAnalyse.py",
> line 95, in _get_supportedSettings
> File "C:\Users\Robert
> Hänggi\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\OnlineImageDescriberHandler.py",
> line 550, in AccessTypeSetting
> AttributeError: 'NoneType' object has no attribute 'StringSettings'
>
> I think getting the trace back during development is better than
> catching all exceptions and just logging them, especially if the
> errors are nested.
>
> Of course, that's my subjective opinion and I don't want to criticise
> your programming style, it is probably better than mine.
>
> Cheers
> Robert
>
>
> On 27/03/2019, Noelia Ruiz <nrm1977@...> wrote:
>> Hi, your suggestion sounds good. I don't use the numeric keypad, and I
>> use the laptop layout with Caps Lock as the NVDA key, so I don't have
>> problems with combinations, but I agree with you.
>>
>> Cheers
>>
>>
>> On 27/03/2019 at 22:16, Robert Hänggi wrote:
>>> Hi Noelia,
>>> I would swap the two gestures. NVDA+Alt for the normal functionality
>>> and NVDA+Shift+Control for the clipboard because pressing three keys
>>> is hard enough, no need for 4. Especially if you want to perform a
>>> double gesture.
>>>
>>> I wonder, shouldn't the NumPad be used, if available, after all, the
>>> gestures for object navigation are situated there.
>>> And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
>>> (I've seen the latter one in the dragAndDrop addOn but I don't know if
>>> that is still maintained).
>>>
>>> Robert
>>>
>>> On 27/03/2019, Noelia Ruiz <nrm1977@...> wrote:
>>>> Hi, another suggestion:
>>>> NVDA+control+shift are used, when pressing r, to object navigator, and
>>>> when pressing p, for clipboard images. And alt+nvda, for navigator
>>>> object in case of image descriptor, and with r for clipboard. This may
>>>> be inconsistent. Maybe better to use control+nvda+shift for navigator
>>>> object and nvda+alt for clipboard?
>>>>
>>>> Cheers
>>>>
>>>>
>>>> On 27/03/2019 at 20:39, Noelia Ruiz via Groups.Io wrote:
>>>>> Hi again, the add-on is on the website. Anyway, I recommend you to
>>>>> format a bit the documentation, for example, you may want to put The
>>>>> first heading level 3 after level 1 to 2, and also fix lists, since
>>>>> after the log option the list finishes, etc.
>>>>> I have put asterisks at start, so that compatibility info, author and
>>>>> download links appear in a list, as done with other add-ons on the
>>>>> website.
>>>>> Also, the link using addonFiles address is used, to avoid issues in
>>>>> translations.
>>>>> When you declare the add-on as stable, we can register it to be
>>>>> translated. Now translators can start with documentation.
>>>>> https://addons.nvda-project.org/addons/onlineOCR.en.html
>>>>>
>>>>> Cheers
>>>>>
>>>>> On 27/03/2019 at 19:29, Noelia Ruiz via Groups.Io wrote:
>>>>>> Hi, basic review results after adding image describers:
>>>>>>
>>>>>> - License and copyright: Pass.
>>>>>> - Security: pass.
>>>>>> - Documentation: Pass.
>>>>>> - User experience: pass with comments.
>>>>>>
>>>>>> Image describers can provide text results only (not in browse mode),
>>>>>> and this has to be fixed. Also, when pressing the gesture twice, NVDA
>>>>>> may report that another recognizer is active.
>>>>>> Anyway, seeing that the author is listening and responding to
>>>>>> feedback, that Oliver's work has been used and credited in the
>>>>>> documentation, that the code shows deep experience with the relevant
>>>>>> NVDA code, that the main structure is done and the failure affects
>>>>>> only part of some features, and that this should be tested in
>>>>>> different languages, I think we can post it in the development
>>>>>> section of the website to share it widely in different communities,
>>>>>> since the add-on is useful now and will be improved.
>>>>>> Since no objections have been expressed, I will post the add-on now.
>>>>>> Here is my log of the image describer errors:
>>>>>>
>>>>>> INFO -
>>>>>> external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard
>>>>>> (19:17:16.818):
>>>>>> (u'C:\\Users\\User\\Downloads\\ant.jpg',)
>>>>>> ERROR -
>>>>>> external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
>>>>>> (19:18:09.683):
>>>>>> 'SimpleTextResult' object is not iterable
>>>>>> ERROR -
>>>>>> external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
>>>>>> (19:18:09.683):
>>>>>> {u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
>>>>>> u'description': {u'captions': [{u'text': u'a drawing of a face',
>>>>>> u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
>>>>>> u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
>>>>>> ERROR -
>>>>>> external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
>>>>>> (19:18:13.302):
>>>>>> 'SimpleTextResult' object is not iterable
>>>>>> ERROR -
>>>>>> external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
>>>>>> (19:18:13.302):
>>>>>> {u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
>>>>>> u'description': {u'captions': [{u'text': u'a drawing of a face',
>>>>>> u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
>>>>>> u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
>>>>>> ERROR -
>>>>>> external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
>>>>>> (19:18:39.265):
>>>>>> 'SimpleTextResult' object is not iterable
>>>>>> ERROR -
>>>>>> external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
>>>>>> (19:18:39.265):
>>>>>> {u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
>>>>>> u'description': {u'captions': [{u'text': u'a drawing of a face',
>>>>>> u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
>>>>>> u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 27/03/2019 at 18:36, Noelia Ruiz via Groups.Io wrote:
>>>>>>> Yes, I have seen an error with the text result in browse mode with the
>>>>>>> image describer plugin. Anyway, the add-on has received a lot of
>>>>>>> feedback, and I think it could be added to the development section of
>>>>>>> the website now: the major part is done, it has passed basic review,
>>>>>>> the quality in general is good, and no major changes in documentation
>>>>>>> are going to happen for now. If you don't have objections, I will post
>>>>>>> it in about an hour for wide testing. Minor versions will be posted as
>>>>>>> they are released, and translators can start translating the
>>>>>>> documentation now, so that different communities can test the add-on
>>>>>>> and give feedback, as a first filter.
>>>>>>> Cheers
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>>> On 27 Mar 2019, at 17:43, Robert Hänggi <aarjay.robert@...> wrote:
>>>>>>>>
>>>>>>>> Hi
>>>>>>>> I'm glad that the addOn makes such huge steps.
>>>>>>>>
>>>>>>>> The browsable message doesn't seem to work for NVDA+Alt+P if I swap
>>>>>>>> the gestures.
>>>>>>>>
>>>>>>>> The description might need some formatting, at least some line
>>>>>>>> breaks.
>>>>>>>>
>>>>>>>> That's the result with azure:
>>>>>>>>
>>>>>>>> Categories: others_ outdoor_ text_sign
>>>>>>>> This image does not contain adult content This image contains racy
>>>>>>>> content
>>>>>>>> Dominant foreground color is Red. Dominant background color is
>>>>>>>> White.
>>>>>>>> Hex code of accent color is 000000. Dominant colors: White The image
>>>>>>>> is not black and white.
>>>>>>>> Tags: screenshot design pixel vector typography
>>>>>>>> The image is not a clip-art. The image is not a lineDrawing
>>>>>>>> Descriptions: a close up of a device
>>>>>>>>
>>>>>>>> (Note there is no racy content, LOL. It's the record button of
>>>>>>>> Audacity.)
>>>>>>>>
>>>>>>>> Empty tags could be omitted.
>>>>>>>> The accent colour could perhaps be represented as RGB or expressed
>>>>>>>> with the NVDA colour descriptions.
>>>>>>>>
>>>>>>>> Anyways, great work.
>>>>>>>>
>>>>>>>> Robert
>>>>>>>>
>>>>>>>>> On 27/03/2019, Noelia Ruiz <nrm1977@...> wrote:
>>>>>>>>> Hi, I will try to post the development version of this wonderful
>>>>>>>>> add-on on the website after job, in the evening, in about 2 hours
>>>>>>>>> or
>>>>>>>>> so.
>>>>>>>>> Personally, I feel a deep emotion for this. Of course, this doesn't
>>>>>>>>> replace a human description where we can ask questions and so on.
>>>>>>>>> I think this is an open door also for students in scientific
>>>>>>>>> degrees,
>>>>>>>>> where people can have a lot of problems and some of us have
>>>>>>>>> listened
>>>>>>>>> things like: "Why don't you try to drive a car?". Since some things
>>>>>>>>> can be identified by machine learning, or at least this can help
>>>>>>>>> us,
>>>>>>>>> and this is very important.
>>>>>>>>> I have a lot of job and hope other reviewers can help with new
>>>>>>>>> add-ons too.
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2019-03-27 16:15 GMT+01:00, Larry Wang <larry.wang.801@...>:
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> Version 0.11 of online ocr addon is released.
>>>>>>>>>> Changes in this version include:
>>>>>>>>>> Change addon summary to online image describer
>>>>>>>>>> Added image description capability
>>>>>>>>>>
>>>>>>>>>> NVDA+Alt+P Recognize current navigator object Then read result. If
>>>>>>>>>> pressed
>>>>>>>>>> twice, open a virtual result document.
>>>>>>>>>>
>>>>>>>>>> Control+Shift+NVDA+P Recognizes image in clipboard . Then read
>>>>>>>>>> result. If
>>>>>>>>>> pressed twice, open a virtual result document.
>>>>>>>>>> Here are three engines available.
>>>>>>>>>>
>>>>>>>>>> ### Machine Learning Engine by Oliver Edholm
>>>>>>>>>> It's a free engine gives description of an image.
>>>>>>>>>> If there is text inside it will do OCR on the image.
>>>>>>>>>> There are two settings for this engine.
>>>>>>>>>> * Language of result:
>>>>>>>>>> English by default. If you configure another language than
>>>>>>>>>> English, the
>>>>>>>>>> description could have translation issues because it's
>>>>>>>>>> automatically
>>>>>>>>>> generated by machine translation service.
>>>>>>>>>>
>>>>>>>>>> Security:
>>>>>>>>>> * The images are sent to a script hosted on the Google Cloud
>>>>>>>>>> Platform for
>>>>>>>>>> analysis. After the analysis the image gets removed from the
>>>>>>>>>> server and
>>>>>>>>>> will never be seen again.
>>>>>>>>>>
>>>>>>>>>> The author of this addon has setup a proxy on www.nvdacn.com for
>>>>>>>>>> users
>>>>>>>>>> who
>>>>>>>>>> cannot access google service access.
>>>>>>>>>> If you want to use this proxy please chose Use proxy on
>>>>>>>>>> www.nvdacn.com in
>>>>>>>>>> access type settings.
>>>>>>>>>>
>>>>>>>>>> If you want to use your own key in the following two Microsoft
>>>>>>>>>> engines.
>>>>>>>>>> Please follow the guide in Microsoft Azure OCR section.
>>>>>>>>>> ### Microsoft Azure Image Analyser
>>>>>>>>>> This engine extracts a rich set of visual features based on the
>>>>>>>>>> image
>>>>>>>>>> content.
>>>>>>>>>> This engine is english only by now.
>>>>>>>>>>
>>>>>>>>>> Visual Features include:
>>>>>>>>>> Adult - detects if the image is pornographic in nature (depicts
>>>>>>>>>> nudity or
>>>>>>>>>> a
>>>>>>>>>> sex act). Sexually suggestive content is also detected.
>>>>>>>>>> Brands - detects various brands within an image, including the
>>>>>>>>>> approximate
>>>>>>>>>> location. The Brands argument is only available in English.
>>>>>>>>>> Categories - categorizes image content according to a taxonomy
>>>>>>>>>> defined in
>>>>>>>>>> documentation.
>>>>>>>>>> Color - determines the accent color, dominant color, and whether
>>>>>>>>>> an image
>>>>>>>>>> is black&white.
>>>>>>>>>> Description - describes the image content with a complete sentence
>>>>>>>>>> in
>>>>>>>>>> supported languages.
>>>>>>>>>> Faces - detects if faces are present. If present, generate
>>>>>>>>>> coordinates,
>>>>>>>>>> gender and age.
>>>>>>>>>> ImageType - detects if image is clip art or a line drawing.
>>>>>>>>>> Objects - detects various objects within an image, including the
>>>>>>>>>> approximate location. The Objects argument is only available in
>>>>>>>>>> English.
>>>>>>>>>> Tags - tags the image with a detailed list of words related to the
>>>>>>>>>> image
>>>>>>>>>> content.
>>>>>>>>>>
>>>>>>>>>> Some features also provide additional details:
>>>>>>>>>>
>>>>>>>>>> Celebrities - identifies celebrities if detected in the image.
>>>>>>>>>> Landmarks - identifies landmarks if detected in the image.
>>>>>>>>>>
>>>>>>>>>> ### Microsoft Azure Image describer
>>>>>>>>>>
>>>>>>>>>> This engine generates a description of an image in human readable
>>>>>>>>>> language
>>>>>>>>>> with complete sentences. The description is based on a collection
>>>>>>>>>> of
>>>>>>>>>> content tags, which are also returned by the operation. More than
>>>>>>>>>> one
>>>>>>>>>> description can be generated for each image. Descriptions are
>>>>>>>>>> ordered by
>>>>>>>>>> their confidence score.
>>>>>>>>>> There are two settings for this engine.
>>>>>>>>>> * Language
>>>>>>>>>> The language in which the service will return a description of the
>>>>>>>>>> image.
>>>>>>>>>> English by default.
>>>>>>>>>>
>>>>>>>>>> * Max Candidates
>>>>>>>>>> Maximum number of candidate descriptions to be returned. The
>>>>>>>>>> default is
>>>>>>>>>> 1.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Here is the direct download link
>>>>>>>>>>
>>>>>>>>>> https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Larry
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz <nrm1977@...>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> In case it's useful, more suggestion:
>>>>>>>>>>> - In the documentation presented in input mode for
>>>>>>>>>>> NVDA+shift+control+r
>>>>>>>>>>> announces that recognizes the content and opens a virtual result,
>>>>>>>>>>> but
>>>>>>>>>>> this is configurable. This may be mentioned indicating that the
>>>>>>>>>>> result
>>>>>>>>>>> could be spoken and braillified or presented in a virtual
>>>>>>>>>>> document.
>>>>>>>>>>> - Regarding the code, I have seen that sometimes it's used
>>>>>>>>>>> something
>>>>>>>>>>> like:
>>>>>>>>>>> if something: return
>>>>>>>>>>> else...
>>>>>>>>>>> But I think in some cases else after return is not needed, since
>>>>>>>>>>> if the
>>>>>>>>>>> previous condition exists, the script return and no more can
>>>>>>>>>>> happen. I
>>>>>>>>>>> made this mistake years ago and Mesar Hameed, a great person and
>>>>>>>>>>> developer, who created initially the scripts used for the
>>>>>>>>>>> translation
>>>>>>>>>>> system and make add-ons translatable by NVDA's translators
>>>>>>>>>>> exchanging
>>>>>>>>>>> messages between add-ons stored on Bitbucket team account and the
>>>>>>>>>>> translators repository, also first creator of guidelines for
>>>>>>>>>>> reviewers
>>>>>>>>>>> and authors and so on, fixed me teaching this mistake made by me,
>>>>>>>>>>> in
>>>>>>>>>>> case you are interested.
>>>>>>>>>>> When you want, for example when image descriptions are added, we
>>>>>>>>>>> can
>>>>>>>>>>> post the add-on on the website, and if this is joined with
>>>>>>>>>>> ImageDescriber, we will fix things later. I say this since maybe
>>>>>>>>>>> complicated to join these add-ons, imo, at least currently, since
>>>>>>>>>>> the
>>>>>>>>>>> two interfaces are very different. Anyway, if you want to wait to
>>>>>>>>>>> post
>>>>>>>>>>> the add-on, this is OK.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>>
>>>>>>>>>>>> On 22/03/2019 at 11:49, Noelia Ruiz via Groups.Io wrote:
>>>>>>>>>>>> Sorry, another thought: I think that Larry could accept pull
>>>>>>>>>>>> requests if the features contained in ImageDescriber are properly
>>>>>>>>>>>> integrated in online_ocr. That said, I don't know if simply joining
>>>>>>>>>>>> the two add-ons is a good idea, and I would understand if, for any
>>>>>>>>>>>> reason, this is not done and they remain independent add-ons.
>>>>>>>>>>>> Please don't feel any pressure on our part (meaning Robert and me).
>>>>>>>>>>>> Just to clarify.
>>>>>>>>>>>>
>>>>>>>>>>>> 2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io
>>>>>>>>>>>> <nrm1977@...>:
>>>>>>>>>>>>> Hi, this is great news. I think that Larry may accept pull
>>>>>>>>>>>>> requests, since the add-on GUI seems to be flexible: it accepts
>>>>>>>>>>>>> profiles, is integrated in NVDA settings, accepts content recog
>>>>>>>>>>>>> messages, etc. Also, I think that creating a pull request for
>>>>>>>>>>>>> this add-on can be easy, using the abstract class for plugins.
>>>>>>>>>>>>> So, imo, if you both agree, I would vote for adding just one
>>>>>>>>>>>>> joined add-on on the website. ImageDescriber could be a better
>>>>>>>>>>>>> name since it's more generic, though I definitely would use
>>>>>>>>>>>>> Larry's UI and framework. What do you think?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers and thanks both for your work.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2019-03-22 11:25 GMT+01:00, Robert Hänggi
>>>>>>>>>>>>> <aarjay.robert@...>:
>>>>>>>>>>>>>> Oliver, I'm very glad that you're open to collaboration.
>>>>>>>>>>>>>> I think your joint efforts would lead to a great add-on.
>>>>>>>>>>>>>> Apart from that, it may help to avoid clones of 3rd party
>>>>>>>>>>>>>> modules present in both add-ons, thus reducing storage and load
>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I could perhaps also help with mouse related stuff (e.g.
>>>>>>>>>>>>>> recognition
>>>>>>>>>>>>>> of unknown pointer shapes automatically).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>>> Robert
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 22/03/2019, Oliver Edholm <oliver.edholm@...> wrote:
>>>>>>>>>>>>>>> Hi! It's the creator of the Image Describer add-on. I looked a
>>>>>>>>>>>>>>> little bit at the code and it looks very well made.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I agree that we do similar things here. Yesterday I added OCR
>>>>>>>>>>>>>>> to the Image Describer backend, for example (without knowing
>>>>>>>>>>>>>>> about this add-on).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm also working on a free Captcha Solver, which seems to be
>>>>>>>>>>>>>>> something Larry has also looked into.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, regarding applying more AI methods to these problems,
>>>>>>>>>>>>>>> it's something I've been working on too. You gave the example
>>>>>>>>>>>>>>> of icon classification; I trained a classifier for this ~2
>>>>>>>>>>>>>>> weeks ago, which I intend to eventually add to the Image
>>>>>>>>>>>>>>> Describer add-on.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm definitely open for some sort of collaboration here.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Mar 22, 2019 at 01:39 AM, Robert Hänggi wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Wouldn't AI conflict with the objectDescriber add-on?
>>>>>>>>>>>>>>>> I mean, two add-ons with essentially the same functionality
>>>>>>>>>>>>>>>> is a bit strange.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> However, I certainly see positive results when AI is
>>>>>>>>>>>>>>>> integrated. Imagine an inaccessible application. A lot of
>>>>>>>>>>>>>>>> times, it uses not only text but also symbols, and it would
>>>>>>>>>>>>>>>> be great if general ones could be recognized:
>>>>>>>>>>>>>>>> - disc icons
>>>>>>>>>>>>>>>> - cog wheel/wrench icons
>>>>>>>>>>>>>>>> - transport controls: play/fast forward/rewind icons
>>>>>>>>>>>>>>>> - home icon
>>>>>>>>>>>>>>>> - triangles (for drop down menus)
>>>>>>>>>>>>>>>> - check boxes, radio buttons, folder icons (= browse for
>>>>>>>>>>>>>>>> folder) and so on.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Additionally, simple rectangles which could indicate form
>>>>>>>>>>>>>>>> fields and edit boxes. Compare the auto form detection used
>>>>>>>>>>>>>>>> in Acrobat Pro DC.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (By the way, in my version of golden cursor I make use of
>>>>>>>>>>>>>>>> changing mouse icons to determine e.g. the boundaries of an
>>>>>>>>>>>>>>>> edit box, scroll bars or sliders.)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Another thing that I would like to see:
>>>>>>>>>>>>>>>> If one presses enter in a recognition result, the default
>>>>>>>>>>>>>>>> action is performed. So far so good, but here are some
>>>>>>>>>>>>>>>> possible improvements:
>>>>>>>>>>>>>>>> Case 1, when the recognition result stays open:
>>>>>>>>>>>>>>>> - perform a new recognition and update the document to
>>>>>>>>>>>>>>>> reflect possible changes.
>>>>>>>>>>>>>>>> Case 2, when the navigator object changes and the document
>>>>>>>>>>>>>>>> closes:
>>>>>>>>>>>>>>>> - if the new navigator object is not a known class (e.g.
>>>>>>>>>>>>>>>> window or system client or generic, without children), do as
>>>>>>>>>>>>>>>> for case 1.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best
>>>>>>>>>>>>>>>> Robert
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 21/03/2019, Noelia Ruiz <nrm1977@...> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Control+NVDA+Shift+I is assigned to the readfeeds add-on, but
>>>>>>>>>>>>>>>>> feel free to assign it, since readfeeds is an add-on
>>>>>>>>>>>>>>>>> maintained and initially created by me (smile).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 21 Mar 2019, at 8:17, Larry Wang <larry.wang.801@...>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> AI recognition is not hard to add, since it performs a
>>>>>>>>>>>>>>>>>> similar operation on an image; I have found several engines
>>>>>>>>>>>>>>>>>> too. But we need another two gestures.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Are NVDA+Ctrl+Shift+I and NVDA+Ctrl+Alt+Shift+I unassigned?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 2019/3/21 12:53, Noelia Ruiz wrote:
>>>>>>>>>>>>>>>>>>> OK, let's wait some hours, and tonight or tomorrow I will
>>>>>>>>>>>>>>>>>>> try to post the add-on on the website if all is OK. For
>>>>>>>>>>>>>>>>>>> now I will upload the dev version with one link, and when
>>>>>>>>>>>>>>>>>>> you declare the add-on stable we will post the link to the
>>>>>>>>>>>>>>>>>>> stable version too and register the add-on in the
>>>>>>>>>>>>>>>>>>> translation system.
>>>>>>>>>>>>>>>>>>> I would like it if you could add AI recognition of images,
>>>>>>>>>>>>>>>>>>> for instance whether they contain a man, a room, blond or
>>>>>>>>>>>>>>>>>>> brown hair, etc.
>>>>>>>>>>>>>>>>>>> This is wonderful work!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 21/03/2019 at 1:47, Larry Wang wrote:
>>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I am happy to release version 0.10 of online ocr addon.
>>>>>>>>>>>>>>>>>>>> Changes in this version include:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Fix error using user's own API key in sougou API
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Fix unknown panel in sougou API settings
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Here is the direct download link:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> https://github.com/larry801/online_ocr/releases/download/0.10-dev/onlineOCR-0.10-dev.nvda-addon
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>> Larry


Re: Online OCR addon #addonrequestreview

Robert Hänggi
 

Thanks Larry
LOL
I'm a musician but no, I don't use all four fingers for the double
press as if it were a piano chord...
I was thinking more along the lines of which NVDA modifier the user might
have defined: Caps Lock, Insert, numpad Insert, or a combination of them.
Depending on the configuration, the gesture is easier or more difficult to
reach.
I will personally assign OCR to NVDA+Alt+O and NVDA+Control+Shift+O
for the simple reason that they are close to the P variant and I don't
have to train my muscle memory for two different hand positions.

Could the download link have a typo? Isn't there an extra "s" before
"dev"?

Also, it might be just me, but the faked object (imageDescription) seems to
have the wrong line breaks: it is all on one line in the result
document.

Best
Robert

On 28/03/2019, Larry Wang <larry.wang.801@gmail.com> wrote:
Hi all,

0.13 version of online image describer addon is released

Changes in this version include

Make sure that the add-on works when reloading the plug-ins without
restart (NVDA+Control+F3)

Here is the download link:

https://github.com/larry801/online_ocr/releases/download/0.13-dev/onlineOCR-0.13-sdev.nvda-addon)

Cheers,

Larry

On 2019/3/28 16:43, Larry Wang wrote:
Hi all,

0.12 version of online image describer addon is released

Changes in this version include



Re: Online OCR addon #addonrequestreview

Noelia Ruiz
 

OK, I have a short task to attend to right now, but later I will
try to give you access to addonFiles.
If you are a member of the translation team, you can also update the
documentation on the website. Otherwise, you can ask for an invitation.


2019-03-28 11:20 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:

Hi Noelia,

If Robert's issue is fixed, I think it is better to post 0.13.

I have a Bitbucket account and I am willing to post versions myself.

Actually, this add-on was in a private repo on Bitbucket before I put it
on GitHub.

This is my profile page.

https://bitbucket.org/Rheinmetal/

On 2019/3/28 17:37, Noelia Ruiz wrote:
Hi, let us know if we should post 0.12, or 0.13 with the profile issue fixed.
Also, for add-ons reviewed by me, if authors show Git experience and have
a Bitbucket account, I try to offer them write access to the addonFiles
repo, so they can update versions on the website themselves if they want.
Otherwise, we can do it.
So, if you have a Bitbucket account and want write access to
addonFiles, let us know. Just be patient, since Bitbucket is not very
comfortable to use and granting this access can be hard; at least in my
case, I sometimes need to retry several times.

Cheers


2019-03-28 9:51 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia

1. The documentation problem mentioned above is corrected in 0.12.

2. The access type setting is valid for Oliver's engine. That is because I
provide a proxy for users without access to Google; by default it connects
directly to Google.
3. I will look into how to use a checkbox. The checkbox in the advanced
panel may be a good candidate.


On 2019/3/28 15:10, Noelia Ruiz wrote:
Also, the plugins based on Oliver's work seem not to need an API key.
So I think that the combo box should be disabled, since the combo
box label refers to the API key, but the options correspond to the proxy.
About the details supported by plugins like Azure Analyser, I think
that a list of checkboxes may be better than a combo box.
Also, I saw that in some places imageInfo appears as not defined. This
is normal, since the add-on is very complex and has a lot of plugins.
Anyway, we can update the version and the formatted documentation easily.

Cheers


On 27/03/2019 at 23:05, Noelia Ruiz via Groups.Io wrote:
Hi, your suggestion sounds good. I don't use the numpad keys; I use the
laptop layout with Caps Lock as the NVDA key, so I don't have problems
with these combinations, but I agree with you.

Cheers


On 27/03/2019 at 22:16, Robert Hänggi wrote:
Hi Noelia,
I would swap the two gestures: NVDA+Alt for the normal functionality
and NVDA+Shift+Control for the clipboard, because pressing three keys
is hard enough, no need for four. Especially if you want to perform a
double press.

I wonder, shouldn't the numpad be used, if available? After all, the
gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
(I've seen the latter one in the dragAndDrop add-on, but I don't know if
that is still maintained).

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+Control+Shift is used with R for the navigator object, and with P
for clipboard images. NVDA+Alt is used with P for the navigator object
in the case of the image describer, and with R for the clipboard. This
may be inconsistent. Maybe it would be better to use Control+NVDA+Shift
for the navigator object and NVDA+Alt for the clipboard?

Cheers


On 27/03/2019 at 20:39, Noelia Ruiz via Groups.Io wrote:
Hi again, the add-on is on the website. Anyway, I recommend formatting
the documentation a bit: for example, you may want to change the
first level 3 heading after level 1 to level 2, and also fix the lists,
since the list ends after the log option, etc.
I have put asterisks at the start, so that compatibility info, author and
download links appear in a list, as done with other add-ons on the
website.
Also, the link using the addonFiles address is used, to avoid issues in
translations.
When you declare the add-on as stable, we can register it to be
translated. Translators can start with the documentation now.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

On 27/03/2019 at 19:29, Noelia Ruiz via Groups.Io wrote:
Hi, basic review results after adding image describers:

- License and copyright: Pass.
- Security: Pass.
- Documentation: Pass.
- User experience: Pass with comments.

Image describers can provide text results (not in browse mode), and
this has to be fixed. Also, when pressing the gesture twice, NVDA can
report that another recognizer is active.
Anyway, the author is listening and responding to feedback, Oliver's
work has been used and is mentioned in the documentation, the code
shows deep familiarity with the relevant NVDA code, the main structure
is done, the failure affects only part of some features, and the add-on
should be tested in different languages. For these reasons, I think we
can post it in the development section of the website to share it widely
in different communities, since the add-on is useful now and will be
improved.
Since no objections have been expressed, I will post the add-on now.
Here is my log with the image describer errors:

INFO - external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard (19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}
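
For readers following the log, here is a minimal Python sketch of how a payload shaped like the one above could be turned into plain lines of text before it is shown to the user. It is not the add-on's actual code and does not claim to be the fix for the error: the function name `caption_lines` is hypothetical, and the dictionary layout is assumed to match the logged payload.

```python
def caption_lines(response):
    """Return one line of text per caption in a describe-style response.

    `response` is assumed to be a dict shaped like the payload in the log.
    Missing keys are tolerated, so an unexpected payload yields an empty
    list instead of raising.
    """
    description = response.get("description", {})
    lines = []
    for caption in description.get("captions", []):
        text = caption.get("text", "")
        if not text:
            continue  # skip captions without any text
        confidence = caption.get("confidence")
        if confidence is None:
            lines.append(text)
        else:
            lines.append("%s (confidence %.0f%%)" % (text, confidence * 100))
    tags = description.get("tags", [])
    if tags:
        lines.append("Tags: " + ", ".join(tags))
    return lines


# Payload copied from the log above (hypothetical usage).
sample = {
    "metadata": {"width": 272, "format": "Png", "height": 201},
    "description": {
        "captions": [
            {"text": "a drawing of a face", "confidence": 0.3644558611090458}
        ],
        "tags": ["drawing", "food"],
    },
    "requestId": "e17fb346-8b0c-434e-b14f-66dabd794424",
}
print("\n".join(caption_lines(sample)))
```

Returning a list of strings keeps each caption on its own line, whichever object ends up presenting the result.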



On 27/03/2019 at 18:36, Noelia Ruiz via Groups.Io wrote:
Yes, I have seen an error with the text result in browse mode with the
image describer plugin. Anyway, the add-on has received a lot of
feedback, and I think it could be added to the development section of
the website now: the major part is done, it has passed basic review,
the quality in general is good, and no major changes in documentation
are going to happen for now. If you don't have objections, I will post
it in about an hour for wide testing. Minor versions will be posted as
they are released, and translators can start translating the
documentation now, so that different communities can test the add-on
and give feedback, as a first filter.
Cheers

Sent from my iPhone

On 27 Mar 2019, at 17:43, Robert Hänggi <aarjay.robert@gmail.com> wrote:

Hi
I'm glad that the addOn makes such huge steps.

The browsable message doesn't seem to work for NVDA+Alt+P if I swap
the gestures.

The description might need some formatting, at least some line
breaks.

That's the result with azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains racy
content
Dominant foreground color is Red. Dominant background color is White.
Hex code of accent color is 000000. Dominant colors: White The image
is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of
Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or expressed
with the NVDA colour descriptions.
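
As an illustration of these two suggestions, here is a small Python sketch. It is only a sketch under assumptions: `hex_to_rgb` and `format_fields` are hypothetical helper names, not functions from the add-on, and mapping the colour to NVDA's own colour descriptions is not attempted here.

```python
def hex_to_rgb(hex_code):
    """Convert a 6-digit hex colour such as '000000' or '#FF8800' to (r, g, b)."""
    digits = hex_code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hex digits, got %r" % hex_code)
    # Each pair of hex digits is one colour channel.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def format_fields(fields):
    """Render (label, values) pairs one per line, skipping pairs with no values."""
    return "\n".join(
        "%s: %s" % (label, ", ".join(values))
        for label, values in fields
        if values
    )


print(hex_to_rgb("000000"))
print(format_fields([
    ("Tags", ["screenshot", "design", "pixel", "vector", "typography"]),
    ("Brands", []),  # empty, so this line is omitted
    ("Descriptions", ["a close up of a device"]),
]))
```

Joining the non-empty fields with newlines would also give the line breaks asked for earlier in this message.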

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this wonderful
add-on on the website after work, in the evening, in about 2 hours or
so.
Personally, I feel deeply moved by this. Of course, this doesn't
replace a human description, where we can ask questions and so on.
I think this also opens a door for students in scientific degrees,
where people can have a lot of problems and some of us have heard
things like: "Why don't you try to drive a car?". Some things
can be identified by machine learning, or at least this can help us,
and this is very important.
I have a lot of work and hope other reviewers can help with new
add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of online ocr addon is released.
Changes in this version include:
Change addon summary to online image describer
Added image description capability

NVDA+Alt+P: Recognizes the current navigator object, then reads the
result. If pressed twice, opens a virtual result document.

Control+Shift+NVDA+P: Recognizes the image in the clipboard, then reads
the result. If pressed twice, opens a virtual result document.

There are three engines available.

### Machine Learning Engine by Oliver Edholm
It's a free engine that gives a description of an image.
If there is text inside, it will do OCR on the image.
There are two settings for this engine.
* Language of result:
English by default. If you configure a language other than English, the
description could have translation issues because it's automatically
generated by a machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud Platform for
analysis. After the analysis the image gets removed from the server and
will never be seen again.

The author of this addon has set up a proxy on www.nvdacn.com for users
who cannot access Google services.
If you want to use this proxy, please choose "Use proxy on
www.nvdacn.com" in the access type settings.

If you want to use your own key in the following two Microsoft engines,
please follow the guide in the Microsoft Azure OCR section.

### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on the image
content.
This engine is English only for now.

Visual features include:
Adult - detects if the image is pornographic in nature (depicts nudity
or a sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including the
approximate location. The Brands argument is only available in English.
Categories - categorizes image content according to a taxonomy defined
in documentation.
Color - determines the accent color, dominant color, and whether an
image is black&white.
Description - describes the image content with a complete sentence in
supported languages.
Faces - detects if faces are present. If present, generates coordinates,
gender and age.
ImageType - detects if image is clip art or a line drawing.
Objects - detects various objects within an image, including the
approximate location. The Objects argument is only available in English.
Tags - tags the image with a detailed list of words related to the image
content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human readable
language with complete sentences. The description is based on a
collection of content tags, which are also returned by the operation.
More than one description can be generated for each image. Descriptions
are ordered by their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description of the
image. English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The default is
1.


Here is the direct download link:

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon

Cheers,
Larry



On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz <nrm1977@gmail.com> wrote:

In case it's useful, some more suggestions:
- The input help message for NVDA+shift+control+r says that it recognizes
the content and opens a virtual result document, but this is configurable.
The documentation could mention that the result can be spoken and brailled
or presented in a virtual document.
- Regarding the code, I have seen that sometimes something like this is
used:
if something: return
else...
But I think in some cases the else after return is not needed, since if
the previous condition is met, the script returns and nothing more can
happen. I made this mistake years ago, and Mesar Hameed taught me to fix
it, in case you are interested. He is a great person and developer who
initially created the scripts used for the translation system, which make
add-ons translatable by NVDA's translators by exchanging messages between
the add-ons stored on the Bitbucket team account and the translators
repository; he was also the first creator of the guidelines for reviewers
and authors, and so on.
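
A minimal sketch of the point about `else` after `return`. The function names and the `None` check are hypothetical illustrations, not code taken from the add-on:

```python
def describe_with_else(image):
    # Style mentioned in the review comment: an else branch after return.
    if image is None:
        return "no image"
    else:
        return "recognizing %s" % image


def describe_early_return(image):
    # Equivalent behaviour: once the if-branch returns, nothing after it
    # runs, so the else can be dropped and the main path is dedented.
    if image is None:
        return "no image"
    return "recognizing %s" % image


print(describe_early_return(None))
print(describe_early_return("ant.jpg"))
```

Both functions behave the same; the second simply has one less level of nesting.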
When you want, for example when image descriptions are added, we can
post the add-on on the website, and if it is joined with
ImageDescriber, we will fix things later. I say this since it may be
complicated to join these add-ons, imo, at least currently, since the
two interfaces are very different. Anyway, if you want to wait to post
the add-on, this is OK.


Cheers

On 22/03/2019 at 11:49, Noelia Ruiz via Groups.Io wrote:
Sorry, another thought: I think that Larry could accept pull requests
if the features contained in ImageDescriber are properly integrated into
online_ocr. For example, I don't know whether it is a good idea simply to
join the two add-ons, and I would understand if, for whatever reason,
they remained independent add-ons. Please don't feel any pressure from
us (meaning Robert and me).
Just to clarify.

2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:
Hi, this is great news. I think that Larry may accept pull requests,
since the add-on GUI seems to be flexible: it accepts profiles, is
integrated into NVDA settings, accepts content recognition messages,
etc. Also, I think that creating a pull request for this add-on can be
easy, using the abstract class for plugins. So, imo, if you both
agree, I would vote for adding just one joined add-on to the website.
ImageDescriber could be a better name since it's more generic, though
I definitely would use Larry's UI and framework. What do you think?

Cheers and thanks both for your work.

2019-03-22 11:25 GMT+01:00, Robert Hänggi <aarjay.robert@gmail.com>:
Oliver, I'm very glad that you're open to collaboration.
I think your joint efforts would lead to a great add-on.
Apart from that, it may help to avoid duplicates of 3rd-party modules
present in both add-ons, thus reducing storage and load time.

I could perhaps also help with mouse-related stuff (e.g. automatic
recognition of unknown pointer shapes).

Cheers
Robert

On 22/03/2019, Oliver Edholm <oliver.edholm@gmail.com> wrote:
Hi! It's the creator of the Image Describer add-on. I looked a little bit
at the code and it looks very well made.

I agree that we do similar things here. Yesterday I added OCR to the Image
Describer backend, for example (without knowing about this add-on).

I'm also working on a free Captcha Solver, which seems to be something
Larry has also looked into.

Also, regarding the example of applying more AI methods to these
problems, it's something I've been working on too. You gave the example of
icon classification; I trained a classifier for this ~2 weeks ago, which I
intend to eventually add to the Image Describer add-on.

I'm definitely open to some sort of collaboration here.

On Fri, Mar 22, 2019 at 01:39 AM, Robert Hänggi wrote:


Wouldn't AI conflict with the objectDescriber add-on?
I mean, two add-ons with essentially the same functionality is a bit
strange.

However, I certainly see positive results when AI is integrated.
Imagine an inaccessible application. A lot of the time, it does not only
use text but also symbols, and it would be great if general ones could
be recognized:
- disc icons
- cog wheel / wrench icons
- transport controls: play/fast forward/rewind icons
- home icon
- triangles (for drop-down menus)
- check boxes, radio buttons, folder icons (= browse for folder)
and so on.

Additionally, simple rectangles, which could indicate form fields and
edit boxes. Compare the auto form detection used in Acrobat Pro DC.

(By the way, in my version of Golden Cursor I make use of changing
mouse icons to determine e.g. the boundaries of an edit box, scroll
bars or sliders.)

Another thing that I would like to see:
If one presses Enter in a recognition result, the default action is
performed. So far so good, but here are some possible improvements:
Case 1, when the recognition result stays open:
- perform a new recognition and update the document to reflect possible
changes.
Case 2, when the navigator object changes and the document closes:
- if the new navigator object is not a known class (e.g. window or
system client or generic, without children), do as for case 1.

Best
Robert



On 21/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:

Control+NVDA+Shift+I is assigned to the readFeeds add-on, but feel free
to assign it, since readFeeds is an add-on maintained and initially
created by me (smile).
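
A rough sketch of how one might check a proposed gesture against gestures already taken by other add-ons. The dictionary of assigned gestures is hypothetical data built from this thread (only the readFeeds assignment is mentioned here), and the helper names are made up for illustration; this is not NVDA's own gesture handling:

```python
def normalize(gesture):
    """Return a canonical form so key order and letter case do not matter."""
    aliases = {"ctrl": "control"}
    keys = [aliases.get(key, key) for key in gesture.lower().split("+")]
    return "+".join(sorted(keys))


def find_conflicts(assigned, wanted):
    """Map each wanted gesture that is already taken to its current owner."""
    taken = {normalize(gesture): owner for gesture, owner in assigned.items()}
    return {
        gesture: taken[normalize(gesture)]
        for gesture in wanted
        if normalize(gesture) in taken
    }


# Hypothetical registry: only the assignment mentioned in this thread.
assigned = {"control+NVDA+shift+i": "readFeeds"}
conflicts = find_conflicts(
    assigned, ["NVDA+Ctrl+Shift+I", "NVDA+Ctrl+Alt+Shift+I"]
)
print(conflicts)
```

Under these assumptions only the first gesture is reported as taken. The sketch does not handle a gesture whose key is the plus sign itself.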

Sent from my iPhone


On 21 Mar 2019, at 8:17, Larry Wang <larry.wang.801@gmail.com> wrote:

AI recognition is not hard to add since it performs a similar operation on
an image; I have found several engines too. But we need another two
gestures.

Are NVDA+Ctrl+Shift+I and NVDA+Ctrl+Alt+Shift+I unassigned?


On 2019/3/21 12:53, Noelia Ruiz wrote:
OK, let's wait some hours, and tonight or tomorrow I will try to post the
add-on on the website if all is OK. For now I will upload the dev version
with one link, and when you declare the add-on as stable we will post the
link to the stable version too and will register the add-on in the
translation system.
I would like it if you could add AI recognition of images, for instance
whether they contain a man, a room, blond or brown hair, etc.
This is wonderful work!

On 21/03/2019 at 1:47, Larry Wang wrote:
Hi everyone,

I am happy to release version 0.10 of the Online OCR add-on.
Changes in this version include:

Fix error when using the user's own API key in the sougou API

Fix unknown panel in sougou API settings

Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.10-dev/onlineOCR-0.10-dev.nvda-addon






Cheers,
Larry

















Re: Online OCR addon #addonrequestreview

Larry Wang
 

Hi Noelia,

If Robert's issue is fixed, I think it is better to post 0.13.

I have a Bitbucket account and I am willing to post versions myself.

Actually, this add-on was in a private repo on Bitbucket before I put it on GitHub.

This is my profile page.

https://bitbucket.org/Rheinmetal/

On 2019/3/28 17:37, Noelia Ruiz wrote:
Hi, let us know if we should post 0.12 or 0.13 with the profile issue fixed.
Also, for add-ons reviewed by me, if authors show Git experience, I
try to offer write access to the addonFiles repo if they have a Bitbucket
account, so authors can update versions themselves on the website, if
they want. Otherwise, we can do it.
So, if you have a Bitbucket account and want write access to
addonFiles, let us know. Just be patient, since Bitbucket is not very
comfortable to use and providing this access can be hard; at least in my
case, sometimes I need to retry it several times.

Cheers


2019-03-28 9:51 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia

1. The documentation problem mentioned above is corrected in 0.12.

2. Access type is valid for Oliver's engine. That is because I provided
a proxy for users without access to Google; by default it connects
directly to Google.
3. I will look into how to use a checkbox. The checkbox in the advanced
panel may be a good candidate.


On 2019/3/28 15:10, Noelia Ruiz wrote:
Also, the plugins based on Oliver's work seem not to need an API key.
So I think that the combo box should be disabled, since the combo
box label refers to the API key, but the options correspond to the proxy.
About the details supported by plugins like Azure Analyser, I think
that a list of checkboxes may be better than a combo box.
Also, I saw that in some part imageInfo appears as not defined. This
is normal, since the add-on is very complex and has a lot of plugins.
Anyway, we can update the version and formatted documentation easily.

Cheers


On 27/03/2019 at 23:05, Noelia Ruiz via Groups.Io wrote:
Hi, your suggestion sounds good. I don't use the numeric keypad and use
the laptop layout with Caps Lock as the NVDA key, so I don't have problems
with combinations, but I agree with you.

Cheers


On 27/03/2019 at 22:16, Robert Hänggi wrote:
Hi Noelia,
I would swap the two gestures. NVDA+Alt for the normal functionality
and NVDA+Shift+Control for the clipboard because pressing three keys
is hard enough, no need for 4. Especially if you want to perform a
double gesture.

I wonder, shouldn't the NumPad be used, if available, after all, the
gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
(I've seen the latter one in the dragAndDrop addOn but I don't know if
that is still maintained).

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+control+shift is used with r for the navigator object and with p
for clipboard images, while alt+NVDA is used for the navigator object in
the case of the image describer, and with r for the clipboard. This may
be inconsistent. Maybe it would be better to use control+NVDA+shift for
the navigator object and NVDA+alt for the clipboard?

Cheers


On 27/03/2019 at 20:39, Noelia Ruiz via Groups.Io wrote:
Hi again, the add-on is on the website. Anyway, I recommend that you
format the documentation a bit; for example, you may want to change the
first level 3 heading after the level 1 heading to level 2, and also fix
the lists, since the list finishes after the log option, etc.
I have put asterisks at the start, so that compatibility info, author and
download links appear in a list, as is done with other add-ons on the
website.
Also, the link using the addonFiles address is used, to avoid issues in
translations.
When you declare the add-on as stable, we can register it to be
translated. For now, translators can start with the documentation.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

On 27/03/2019 at 19:29, Noelia Ruiz via Groups.Io wrote:
Hi, basic review results after adding image describers:

- License and copyright: pass.
- Security: pass.
- Documentation: pass.
- User experience: pass with comments.

Image describers can provide text results (not in browse mode), and
this has to be fixed. Also, when pressing the gesture twice, NVDA can
report that another recognizer is active.
Anyway, seeing that the author is listening and responding to feedback,
that Oliver's work has been used and mentioned in the documentation,
that the code shows deep experience with the relevant NVDA code, that the
main structure is done and the failure affects only part of some
features, and that this should be tested in different languages, I think
we can post it in the development section of the website to share it
widely in different communities, since the add-on is useful now and will
be improved.
Since no objections have been expressed, I will post the add-on now.
Here is my log with the image describer errors:

INFO - external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard (19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}
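
The "object is not iterable" message in the log is what Python raises when code loops over an object that defines no iteration. The sketch below reproduces it with a stand-in class. `SimpleTextResult` here is a hypothetical minimal class that only borrows the name from the log, and `lines_from` is an invented helper; the add-on's actual callback is not shown in this thread, so where the loop happens there is an assumption.

```python
class SimpleTextResult:
    """Hypothetical stand-in: holds one block of text, defines no __iter__."""

    def __init__(self, text):
        self.text = text


def lines_from(result):
    # Accept either a single result object or an iterable of text lines.
    if isinstance(result, SimpleTextResult):
        return result.text.splitlines()
    return list(result)


result = SimpleTextResult("a drawing of a face")

try:
    for line in result:  # raises TypeError: the class is not iterable
        pass
except TypeError as error:
    message = str(error)

print(message)
print(lines_from(result))
```

One possible fix, under these assumptions, is to read the text attribute of the result rather than iterate over the result object itself.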



On 27/03/2019 at 18:36, Noelia Ruiz via Groups.Io wrote:
Yes, I have seen an error with the text result in browse mode with the
image describer plugin. Anyway, I think it has had a lot of feedback
and it could simply be added to the development section of the website,
since the major part is done, it has passed basic review, the quality in
general is good, and no major changes in documentation are going to
happen for now. If you don't have objections, I will post this in
about an hour for wide testing, and if minor versions are released
they will be posted, but now translators can start to translate the
documentation so that different communities can test the add-on with
some quality and feedback, as a first filter.
Cheers

Sent from my iPhone

On 27 Mar 2019, at 17:43, Robert Hänggi <aarjay.robert@gmail.com> wrote:

Hi
I'm glad that the add-on is making such huge steps.

The browsable message doesn't seem to work for NVDA+Alt+P if I swap
the gestures.

The description might need some formatting, at least some line breaks.

That's the result with Azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains racy
content
Dominant foreground color is Red. Dominant background color is
White.
Hex code of accent color is 000000. Dominant colors: White The
image
is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of
Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or expressed
with the NVDA colour descriptions.
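
A small sketch of the two formatting suggestions above: converting the hex accent colour to RGB and leaving out empty fields. The function names and the "Brands" entry with no values are hypothetical; the other values are copied from the Azure result quoted in this message.

```python
def hex_to_rgb(hex_code):
    """Convert a six-digit hex colour such as '000000' to an (r, g, b) tuple."""
    hex_code = hex_code.lstrip("#")
    if len(hex_code) != 6:
        raise ValueError("expected six hex digits, got %r" % hex_code)
    return tuple(int(hex_code[i:i + 2], 16) for i in (0, 2, 4))


def format_report(fields):
    """Join (label, values) pairs one per line, skipping empty entries."""
    lines = []
    for label, values in fields:
        if not values:
            continue  # omit labels that have nothing to report
        lines.append("%s: %s" % (label, ", ".join(values)))
    return "\n".join(lines)


report = format_report([
    ("Categories", ["others_", "outdoor_", "text_sign"]),
    ("Brands", []),  # hypothetical empty field, dropped from the output
    ("Tags", ["screenshot", "design", "pixel", "vector", "typography"]),
])
print(report)
print(hex_to_rgb("000000"))
```

Mapping the RGB tuple to NVDA's own colour names would need NVDA's colour support and is outside this sketch.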

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this wonderful
add-on on the website after work, in the evening, in about 2 hours or
so.
Personally, I feel deep emotion about this. Of course, this doesn't
replace a human description, where we can ask questions and so on.
I think this also opens a door for students in scientific degrees,
where people can have a lot of problems and some of us have heard
things like: "Why don't you try to drive a car?". Some things
can be identified by machine learning, or at least this can help us,
and this is very important.
I have a lot of work and hope other reviewers can help with new
add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of the Online OCR add-on is released.
Changes in this version include:
Change add-on summary to online image describer
Added image description capability

NVDA+Alt+P: Recognizes the current navigator object, then reads the
result. If pressed twice, opens a virtual result document.

Control+Shift+NVDA+P: Recognizes the image in the clipboard, then reads
the result. If pressed twice, opens a virtual result document.
Three engines are available.

### Machine Learning Engine by Oliver Edholm
It's a free engine that gives a description of an image.
If there is text inside, it will do OCR on the image.
There are two settings for this engine.
* Language of result:
English by default. If you configure a language other than English, the
description could have translation issues because it's automatically
generated by a machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud Platform for
analysis. After the analysis the image is removed from the server and
will never be seen again.

The author of this add-on has set up a proxy on www.nvdacn.com for users
who cannot access Google services.
If you want to use this proxy, please choose "Use proxy on www.nvdacn.com"
in the access type settings.

If you want to use your own key with the following two Microsoft engines,
please follow the guide in the Microsoft Azure OCR section.
### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on the image
content.
This engine is English-only for now.

Visual features include:
Adult - detects if the image is pornographic in nature (depicts nudity or
a sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including the approximate
location. The Brands argument is only available in English.
Categories - categorizes image content according to a taxonomy defined in
the documentation.
Color - determines the accent color, dominant color, and whether an image
is black and white.
Description - describes the image content with a complete sentence in
supported languages.
Faces - detects if faces are present. If present, generates coordinates,
gender and age.
ImageType - detects if the image is clip art or a line drawing.
Objects - detects various objects within an image, including the
approximate location. The Objects argument is only available in English.
Tags - tags the image with a detailed list of words related to the image
content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human-readable language
with complete sentences. The description is based on a collection of
content tags, which are also returned by the operation. More than one
description can be generated for each image. Descriptions are ordered by
their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description of the image.
English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The default is 1.
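
As a rough illustration of the confidence ordering and the Max Candidates setting, the sketch below picks captions out of a response shaped like the one in the error log quoted earlier in this thread. The `top_captions` helper is hypothetical; it only post-processes a dictionary and makes no call to the Azure service.

```python
def top_captions(response, max_candidates=1):
    """Return up to max_candidates caption texts, highest confidence first."""
    captions = response.get("description", {}).get("captions", [])
    ranked = sorted(captions, key=lambda c: c["confidence"], reverse=True)
    return [caption["text"] for caption in ranked[:max_candidates]]


# Response copied from the log in this thread (Python 3 string literals).
response = {
    "metadata": {"width": 272, "format": "Png", "height": 201},
    "description": {
        "captions": [
            {"text": "a drawing of a face", "confidence": 0.3644558611090458}
        ],
        "tags": ["drawing", "food"],
    },
    "requestId": "e17fb346-8b0c-434e-b14f-66dabd794424",
}

print(top_captions(response))
```

With the default of one candidate, only the highest-confidence caption is returned; a missing "description" key yields an empty list rather than an error.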


Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon






Cheers,
Larry



On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz <nrm1977@gmail.com> wrote:

In case it's useful, some more suggestions:
- The input help message for NVDA+shift+control+r says that it recognizes
the content and opens a virtual result document, but this is configurable.
The documentation could mention that the result can be spoken and brailled
or presented in a virtual document.
- Regarding the code, I have seen that sometimes something like this is
used:
if something: return
else...
But I think in some cases the else after return is not needed, since if
the previous condition is met, the script returns and nothing more can
happen. I made this mistake years ago, and Mesar Hameed taught me to fix
it, in case you are interested. He is a great person and developer who
initially created the scripts used for the translation system, which make
add-ons translatable by NVDA's translators by exchanging messages between
the add-ons stored on the Bitbucket team account and the translators
repository; he was also the first creator of the guidelines for reviewers
and authors, and so on.
When you want, for example when image descriptions are added, we can
post the add-on on the website, and if it is joined with
ImageDescriber, we will fix things later. I say this since it may be
complicated to join these add-ons, imo, at least currently, since the
two interfaces are very different. Anyway, if you want to wait to post
the add-on, this is OK.


Cheers

On 22/03/2019 at 11:49, Noelia Ruiz via Groups.Io wrote:
Sorry, another thought: I think that Larry could accept pull requests
if the features contained in ImageDescriber are properly integrated into
online_ocr. For example, I don't know whether it is a good idea simply to
join the two add-ons, and I would understand if, for whatever reason,
they remained independent add-ons. Please don't feel any pressure from
us (meaning Robert and me).
Just to clarify.

2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:
Hi, this is great news. I think that Larry may accept pull requests,
since the add-on GUI seems to be flexible: it accepts profiles, is
integrated into NVDA settings, accepts content recognition messages,
etc. Also, I think that creating a pull request for this add-on can be
easy, using the abstract class for plugins. So, imo, if you both
agree, I would vote for adding just one joined add-on to the website.
ImageDescriber could be a better name since it's more generic, though
I definitely would use Larry's UI and framework. What do you think?

Cheers and thanks both for your work.

2019-03-22 11:25 GMT+01:00, Robert Hänggi <aarjay.robert@gmail.com>:
Oliver, I'm very glad that you're open to collaboration.
I think your joint efforts would lead to a great add-on.
Apart from that, it may help to avoid duplicates of 3rd-party modules
present in both add-ons, thus reducing storage and load time.

I could perhaps also help with mouse-related stuff (e.g. automatic
recognition of unknown pointer shapes).

Cheers
Robert

On 22/03/2019, Oliver Edholm <oliver.edholm@gmail.com> wrote:
Hi! It's the creator of the Image Describer add-on. I looked a little bit
at the code and it looks very well made.

I agree that we do similar things here. Yesterday I added OCR to the Image
Describer backend, for example (without knowing about this add-on).

I'm also working on a free Captcha Solver, which seems to be something
Larry has also looked into.

Also, regarding the example of applying more AI methods to these
problems, it's something I've been working on too. You gave the example of
icon classification; I trained a classifier for this ~2 weeks ago, which I
intend to eventually add to the Image Describer add-on.

I'm definitely open to some sort of collaboration here.

On Fri, Mar 22, 2019 at 01:39 AM, Robert Hänggi wrote:


Wouldn't AI conflict with the objectDescriber add-on?
I mean, two add-ons with essentially the same functionality is a bit
strange.

However, I certainly see positive results when AI is integrated.
Imagine an inaccessible application. A lot of the time, it does not only
use text but also symbols, and it would be great if general ones could
be recognized:
- disc icons
- cog wheel / wrench icons
- transport controls: play/fast forward/rewind icons
- home icon
- triangles (for drop-down menus)
- check boxes, radio buttons, folder icons (= browse for folder)
and so on.

Additionally, simple rectangles, which could indicate form fields and
edit boxes. Compare the auto form detection used in Acrobat Pro DC.

(By the way, in my version of Golden Cursor I make use of changing
mouse icons to determine e.g. the boundaries of an edit box, scroll
bars or sliders.)

Another thing that I would like to see:
If one presses Enter in a recognition result, the default action is
performed. So far so good, but here are some possible improvements:
Case 1, when the recognition result stays open:
- perform a new recognition and update the document to reflect possible
changes.
Case 2, when the navigator object changes and the document closes:
- if the new navigator object is not a known class (e.g. window or
system client or generic, without children), do as for case 1.

Best
Robert



On 21/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:

Control+NVDA+Shift+I is assigned to the readFeeds add-on, but feel free
to assign it, since readFeeds is an add-on maintained and initially
created by me (smile).

Sent from my iPhone


On 21 Mar 2019, at 8:17, Larry Wang <larry.wang.801@gmail.com> wrote:

AI recognition is not hard to add since it performs a similar operation on
an image; I have found several engines too. But we need another two
gestures.

Are NVDA+Ctrl+Shift+I and NVDA+Ctrl+Alt+Shift+I unassigned?


On 2019/3/21 12:53, Noelia Ruiz wrote:
OK, let's wait some hours, and tonight or tomorrow I will try to post the
add-on on the website if all is OK. For now I will upload the dev version
with one link, and when you declare the add-on as stable we will post the
link to the stable version too and will register the add-on in the
translation system.
I would like it if you could add AI recognition of images, for instance
whether they contain a man, a room, blond or brown hair, etc.
This is wonderful work!

On 21/03/2019 at 1:47, Larry Wang wrote:
Hi everyone,

I am happy to release version 0.10 of the Online OCR add-on.
Changes in this version include:

Fix error when using the user's own API key in the sougou API

Fix unknown panel in sougou API settings

Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.10-dev/onlineOCR-0.10-dev.nvda-addon






Cheers,
Larry















Re: Online OCR addon #addonrequestreview

Larry Wang
 

Hi Noelia,

If Robert's issue is fixed, I think it is better to post 0.13.

I have a bitbucket account and I am willing to post versions by my self.

Actually this addon was on a private repo on bitbucket before I put it on github.

This is my profile page.

https://bitbucket.org/Rheinmetal/

On 2019/3/28 17:37, Noelia Ruiz wrote:
Hi, let us know if we should post 0.12 or 0.13 with fixed profile issue.
Also, for add-ons reviewed by me, if authors show Git experience, I
try to offer write access to addonFiles repo if they have a Bitbucket
account, so authors can update versions themselves on the website, if
they want. Otherwise, we can do it.
So, if you have a Bitbucket account and want write access to
addonFiles, let us know. Just be patient, since Bitbucket is not very
comfortable and provide this access can be hard and, at least in my
case, sometimes I need to retry it several times.

Cheers


2019-03-28 9:51 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi Noelia

1. Documentation problem mentioned above is corrected in 0.12

2. Access type is valid for Oliver's engine. That is because I provided
a proxy for users without access to google, by default it connects
directly to google .
3. I will look into how to use a checkbox. The checkbox in advanced
panel may be a good candidate.


On 2019/3/28 15:10, Noelia Ruiz wrote:
Also, the plugins based on Oliver's work seems not to need an api key.
Then I think that the combo box should be disabled, since the combo
box label refers to api key, but options correspond to proxy.
About the details supported by plugins like Azure Analizer, I think
that a list of checkbox may be better than a combo box.
Also, I saw that in some part imageInfo appears as not defined. This
is normal, since the add-on is very complex and have a lot of plugins.
Anyway, we can update the version and formatted documentation easily.

Cheers


El 27/03/2019 a las 23:05, Noelia Ruiz via Groups.Io escribió:
Hi, your suggestion sounds good. I don't use numeric keys and use
laptop key with bloc uppercase as NVDA key, so I don't have problems
with combinations, but I agree with you.

Cheers


El 27/03/2019 a las 22:16, Robert Hänggi escribió:
Hi Noelia,
I would swap the two gestures. NVDA+Alt for the normal functionality
and NVDA+Shift+Control for the clipboard because pressing three keys
is hard enough, no need for 4. Especially if you want to perform a
double gesture.

I wonder, shouldn't the NumPad be used, if available, after all, the
gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
(I've seen the latter one in the dragAndDrop addOn but I don't know if
that is still maintained).

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+control+shift are used, when pressing r, to object navigator, and
when pressing p, for clipboard images. And alt+nvda, for navigator
object in case of image descriptor, and with r for clipboard. This may
be inconsistent. Maybe better to use control+nvda+shift for navigator
object and nvda+alt for clipboard?

Cheers


El 27/03/2019 a las 20:39, Noelia Ruiz via Groups.Io escribió:
Hi again, the add-on is on the website. Anyway, I recommend you to
format a bit the documentation, for example, you may want to put The
first heading level 3 after level 1 to 2, and also fix lists, since
after the log option the list finishes, etc.
I have put asterisks at start, so that compatibility info, author and
download links appear in a list, as done with other add-ons on the
website.
Also, the link using addonFiles address is used, to avoid issues in
translations.
When you declare the add-on as stable, we can register it to be
translated. Now translators can start with documentation.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

https://addons.nvda-project.org/addons/onlineOCR.en.html
El 27/03/2019 a las 19:29, Noelia Ruiz via Groups.Io escribió:
Hi, basic review results after adding image describers:

- License and copyright: Pass.
- Security: pass.
- Documentation: Pass.
- User experience: pass with comments.

Image describers can provide text results (not in browse mode), and
this has to be fixed. Also, when pressing the gesture twice, NVDA
can
report that there is active another recognizer.
Anyway, seing that author is listening and responding feed-back, and
that Oliver work has been used and mentioned in documentation, and
moreover the code shows a deep experience looking at NVDA relevant
code, and also that the main structure is done, and fail refers to
part of some features, and that this should be tested in different
languages, I think we can post it in the development section of
website to share widely in different communities, since the
add-on is
useful now and will be improved.
Here is my log. Since no objections are expressed, I will post the
add-on now, and this is the log for emage describer errors:

INFO - external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard (19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201}, u'description': {u'captions': [{u'text': u'a drawing of a face', u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']}, u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201}, u'description': {u'captions': [{u'text': u'a drawing of a face', u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']}, u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR - external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback (19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201}, u'description': {u'captions': [{u'text': u'a drawing of a face', u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']}, u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}
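
For readers following the log: the response itself is an ordinary dictionary, so the caption text can be pulled out of it directly. The following is only a minimal sketch, assuming the response shape shown in the log above; the function name captions_from_response and its min_confidence parameter are hypothetical and are not part of the add-on.

```python
def captions_from_response(response, min_confidence=0.0):
    """Return caption texts from a describe-style response, best first.

    Hypothetical helper: it only assumes the dictionary layout seen in the log.
    """
    description = response.get("description", {})
    captions = description.get("captions", [])
    # Keep only well-formed entries at or above the confidence threshold.
    usable = [
        c for c in captions
        if "text" in c and c.get("confidence", 0.0) >= min_confidence
    ]
    # Most confident caption first.
    usable.sort(key=lambda c: c.get("confidence", 0.0), reverse=True)
    return [c["text"] for c in usable]


# Response copied from the log (unicode prefixes dropped).
response = {
    "metadata": {"width": 272, "format": "Png", "height": 201},
    "description": {
        "captions": [
            {"text": "a drawing of a face", "confidence": 0.3644558611090458}
        ],
        "tags": ["drawing", "food"],
    },
    "requestId": "e17fb346-8b0c-434e-b14f-66dabd794424",
}

for line in captions_from_response(response):
    print(line)
```

Returning a plain list of strings keeps the result iterable, whichever object is later used to present it.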



On 27/03/2019 at 18:36, Noelia Ruiz via Groups.Io wrote:
Yes, I have seen an error with the text result in browse mode with the image describer plugin. Anyway, I think it has received a lot of feedback and could simply be added to the development section of the website, since the major part is done, it has passed basic review, the quality in general is good, and no major changes to the documentation are expected for now. If you don't have objections, I will post it in about an hour for wide testing. If minor versions are released, they will be posted too, but translators can start translating the documentation now, so that different communities can test the add-on with some quality and give feedback, as a first filter.
Cheers

Sent from my iPhone

On 27 Mar 2019, at 17:43, Robert Hänggi <aarjay.robert@gmail.com> wrote:

Hi
I'm glad that the add-on is making such huge strides.

The browsable message doesn't seem to work for NVDA+Alt+P if I swap the gestures.

The description might need some formatting, at least some line breaks.

That's the result with Azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains racy content
Dominant foreground color is Red. Dominant background color is White.
Hex code of accent color is 000000. Dominant colors: White The image is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or expressed with the NVDA colour descriptions.
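
The two suggestions above (RGB instead of a hex code, and omitting empty fields) could look roughly like this. It is only a sketch using the standard library; hex_to_rgb and format_description are hypothetical names and do not come from the add-on or from NVDA.

```python
def hex_to_rgb(hex_code):
    """Convert a six-digit hex colour such as '000000' to an (r, g, b) tuple."""
    text = hex_code.strip().lstrip("#")
    if len(text) != 6:
        raise ValueError("expected six hex digits, got %r" % hex_code)
    # Each pair of hex digits is one colour channel.
    return tuple(int(text[i:i + 2], 16) for i in (0, 2, 4))


def format_description(fields):
    """Join (label, value) pairs one per line, skipping empty values."""
    lines = []
    for label, value in fields:
        if value:
            lines.append("%s: %s" % (label, value))
    return "\n".join(lines)


red, green, blue = hex_to_rgb("000000")
print(format_description([
    ("Tags", "screenshot design pixel vector typography"),
    ("Brands", ""),  # empty, so it is left out
    ("Accent colour", "RGB %d, %d, %d" % (red, green, blue)),
]))
```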

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this wonderful add-on on the website after work, in the evening, in about 2 hours or so.
Personally, I am deeply moved by this. Of course, this doesn't replace a human description, where we can ask questions and so on.
I think this also opens a door for students in scientific degrees, where people can run into a lot of problems and some of us have heard things like: "Why don't you try to drive a car?". Some things can be identified by machine learning, or at least it can help us, and this is very important.
I have a lot of work and hope other reviewers can help with new add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of the online OCR add-on is released.
Changes in this version include:
Changed the add-on summary to "online image describer"
Added image description capability

NVDA+Alt+P: recognizes the current navigator object, then reads the result. If pressed twice, opens a virtual result document.

Control+Shift+NVDA+P: recognizes the image in the clipboard, then reads the result. If pressed twice, opens a virtual result document.
Three engines are available.

### Machine Learning Engine by Oliver Edholm
A free engine that gives a description of an image.
If there is text in the image, it will also perform OCR on it.
There are two settings for this engine.
* Language of result:
English by default. If you configure a language other than English, the description could have translation issues because it is automatically generated by a machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud Platform for analysis. After the analysis, the image is removed from the server and will never be seen again.

The author of this add-on has set up a proxy on www.nvdacn.com for users who cannot access Google services.
If you want to use this proxy, please choose "Use proxy on www.nvdacn.com" in the access type settings.

If you want to use your own key with the following two Microsoft engines, please follow the guide in the Microsoft Azure OCR section.
### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on the image content.
This engine is English-only for now.

Visual features include:
Adult - detects if the image is pornographic in nature (depicts nudity or a sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English.
Categories - categorizes image content according to a taxonomy defined in documentation.
Color - determines the accent color, dominant color, and whether an image is black and white.
Description - describes the image content with a complete sentence in supported languages.
Faces - detects if faces are present. If present, generates coordinates, gender and age.
ImageType - detects if the image is clip art or a line drawing.
Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English.
Tags - tags the image with a detailed list of words related to the image content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human-readable language with complete sentences. The description is based on a collection of content tags, which are also returned by the operation. More than one description can be generated for each image. Descriptions are ordered by their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description of the image. English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The default is 1.


Here is the direct download link:

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon

Cheers,
Larry



On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz <nrm1977@gmail.com> wrote:

In case it's useful, some more suggestions:
- The description presented in input help mode for NVDA+shift+control+r announces that it recognizes the content and opens a virtual result document, but this is configurable. The description could mention that the result can be spoken and brailled or presented in a virtual document.
- Regarding the code, I have seen that it sometimes uses something like:
if something: return
else...
But I think that in some cases the else after return is not needed, since if the previous condition holds, the script returns and nothing more can happen. I made this mistake years ago, and Mesar Hameed corrected me and taught me about it, in case you are interested. Mesar is a great person and developer, who initially created the scripts used for the translation system, which make add-ons translatable by NVDA's translators by exchanging messages between the add-ons stored on the Bitbucket team account and the translators' repository, and who was also the first author of the guidelines for reviewers and authors.
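
To illustrate the point about else after return, here is a small sketch. Both functions are hypothetical and are not taken from the add-on's code; they behave identically, and the second simply drops the redundant else.

```python
def describe_with_else(result):
    # Works, but the else is redundant: the first branch already returned.
    if not result:
        return "No result"
    else:
        return "Result: " + result


def describe_early_return(result):
    # Same behaviour with an early return and no else branch.
    if not result:
        return "No result"
    return "Result: " + result


for sample in ("", "a drawing of a face"):
    print(describe_early_return(sample))
```

The early-return form keeps the main path at the lowest indentation level, which tends to help in long script methods.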
Whenever you want, for example once image descriptions are added, we can post the add-on on the website, and if it is later joined with ImageDescriber, we will sort things out then. I say this because joining these add-ons may be complicated, imo, at least currently, since the two interfaces are very different. Anyway, if you prefer to wait before posting the add-on, that is OK.


Cheers

On 22/03/2019 at 11:49, Noelia Ruiz via Groups.Io wrote:
Sorry, another thought: I think that Larry could accept pull requests if the features contained in ImageDescriber are properly integrated in online_ocr. That said, I don't know whether just joining the two add-ons is a good idea, and I would understand if, for any reason, this is not done and they remain independent add-ons. Please don't feel any pressure from us (meaning Robert and me).
Just to clarify.

2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io <nrm1977=gmail.com@groups.io>:
Hi, this is great news. I think that Larry may accept pull requests, since the add-on GUI seems to be flexible: it accepts profiles, is integrated in NVDA settings, accepts content recognition messages, etc. Also, I think that creating a pull request for this add-on can be easy, using the abstract class for plugins. So, imo, if you both agree, I would vote for adding just one joined add-on to the website. ImageDescriber could be a better name since it's more generic, though I would definitely use Larry's UI and framework. What do you think?

Cheers and thanks both for your work.

2019-03-22 11:25 GMT+01:00, Robert Hänggi <aarjay.robert@gmail.com>:
Oliver, I'm very glad that you're open to collaboration.
I think your joint efforts would lead to a great add-on.
Apart from that, it may help to avoid clones of 3rd party modules present in both add-ons, thus reducing storage and load time.

I could perhaps also help with mouse-related stuff (e.g. automatic recognition of unknown pointer shapes).

Cheers
Robert

On 22/03/2019, Oliver Edholm <oliver.edholm@gmail.com> wrote:
Hi! It's the creator of the Image Describer add-on. I looked a little bit at the code and it looks very well made.

I agree that we do similar things here. Yesterday, for example, I added OCR to the Image Describer backend (without knowing about this add-on).

I'm also working on a free Captcha Solver, which seems to be something Larry has also looked into.

Regarding the example of applying more AI methods to these problems, that is also something I've been working on. You gave the example of icon classification; I trained a classifier for this ~2 weeks ago, which I intend to eventually add to the Image Describer add-on.

I'm definitely open to some sort of collaboration here.

On Fri, Mar 22, 2019 at 01:39 AM, Robert Hänggi wrote:


Wouldn't AI conflict with the objectDescriber add-on?
I mean, two add-ons with essentially the same functionality is a bit strange.

However, I certainly see positive results when AI is integrated.
Imagine an inaccessible application. A lot of the time, it uses not only text but also symbols, and it would be great if the general ones could be recognized:
- disc icons
- cog wheel / wrench icons
- transport controls: play/fast forward/rewind icons
- home icon
- triangles (for drop-down menus)
- check boxes, radio buttons, folder icons (= browse for folder) and so on.

Additionally, simple rectangles, which could indicate form fields and edit boxes. Compare the auto form detection used in Acrobat Pro DC.

(By the way, in my version of golden cursor I make use of changing mouse icons to determine e.g. the boundaries of an edit box, scroll bars or sliders.)

Another thing that I would like to see:
If one presses enter in a recognition result, the default action is performed. So far so good, but here are some possible improvements:
Case 1, when the recognition result stays open:
- perform a new recognition and update the document to reflect possible changes.
Case 2, when the navigator object changes and the document closes:
- if the new navigator object is not a known class (e.g. window or system client or generic, without children), do as for case 1.

Best
Robert



On 21/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:

control nvda shift i is assigned to the readFeeds add-on, but feel free to assign it, since readFeeds is an add-on maintained and initially created by me (smile).

Sent from my iPhone


On 21 Mar 2019, at 8:17, Larry Wang <larry.wang.801@gmail.com> wrote:

AI recognition is not hard to add, since it performs a similar operation on an image, and I have found several engines too. But we need another two gestures.

Are NVDA+Ctrl+Shift+I and NVDA+Ctrl+Alt+Shift+I unassigned?


On 2019/3/21 12:53, Noelia Ruiz wrote:
OK, let's wait some hours, and tonight or tomorrow I will try to post the add-on on the website if all is OK. For now I will upload the dev version with one link, and when you declare the add-on stable we will post the link to the stable version too and register the add-on in the translation system.
I would like it if you could add AI recognition of images, for instance whether they contain a man, a room, blond or brown hair, etc.
This is wonderful work!

On 21/03/2019 at 1:47, Larry Wang wrote:
Hi everyone,

I am happy to release version 0.10 of the online OCR add-on.
Changes in this version include:

Fix error when using the user's own API key in the sougou API

Fix unknown panel in sougou API settings

Here is the direct download link:

https://github.com/larry801/online_ocr/releases/download/0.10-dev/onlineOCR-0.10-dev.nvda-addon

Cheers,
Larry


Re: image describer

Noelia Ruiz
 

I get this error:

INFO - external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard (11:08:21.948):
(u'C:\\Users\\USUARIO\\Downloads\\ant.jpg',)
ERROR - stderr (11:08:39.793):
Exception in thread Thread-12:
Traceback (most recent call last):
File "threading.pyo", line 801, in __bootstrap_inner
File "threading.pyo", line 754, in run
File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\winHttp.py", line 115, in postContent
File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\winHttp.py", line 109, in doHTTPRequest
File "C:\Users\USUARIO\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\onlineOCRHandler.py", line 521, in callback
File "ui.pyo", line 67, in message
File "braille.pyo", line 1795, in message
File "braille.pyo", line 1809, in _resetMessageTimer
File "wx\core.pyo", line 3284, in __init__
File "wx\core.pyo", line 3305, in Start
wxAssertionError: C++ assertion "wxThread::IsMain()" failed at ..\..\src\common\timerimpl.cpp(60) in wxTimerImpl::Start(): timer can only be started from the main thread
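
The traceback shows ui.message being reached from a worker thread (Thread-12), while the timer it starts must be created on the main thread. The general remedy is to hand the call over to the main thread instead of making it directly from the worker. The sketch below shows that pattern with the standard library only; call_on_main_thread and show_message are hypothetical names. In a wxPython application such as NVDA, wx.CallAfter plays the role of call_on_main_thread; whether the add-on was fixed this way is not stated in this thread.

```python
import queue
import threading

# Calls queued by worker threads, to be executed by the dispatching thread.
pending_calls = queue.Queue()
dispatcher_thread = threading.current_thread()
shown = []


def call_on_main_thread(func, *args):
    """Queue func(*args) instead of running it on the calling thread."""
    pending_calls.put((func, args))


def show_message(text):
    # Stand-in for a UI call that is only safe on the dispatching thread.
    assert threading.current_thread() is dispatcher_thread
    shown.append(text)


def worker():
    # Simulates the HTTP callback: it must not touch the UI directly.
    call_on_main_thread(show_message, "a drawing of a face")


thread = threading.Thread(target=worker)
thread.start()
thread.join()

# The dispatching thread drains the queue (a GUI event loop does this for us).
while not pending_calls.empty():
    func, args = pending_calls.get()
    func(*args)

print(shown)
```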


2019-03-28 11:06 GMT+01:00, Shaun Everiss <sm.everiss@gmail.com>:

Hi.

Pulled the image describer from screanreader.ai, and it's fine, but will it be added to the updater so it can be updated?

And that's another thing: there should really be a way to add add-ons to some database file that the updater downloads. Right now, every time new add-ons appear, you need to release another version of the updater. I know a lot of updaters download a list of all available updates in some ini or database file and then update based on that.

Just wondering.






image describer

 

Hi.

Pulled the image describer from screanreader.ai, and it's fine, but will it be added to the updater so it can be updated?

And that's another thing: there should really be a way to add add-ons to some database file that the updater downloads. Right now, every time new add-ons appear, you need to release another version of the updater. I know a lot of updaters download a list of all available updates in some ini or database file and then update based on that.

Just wondering.
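
The idea of a downloadable list file could be sketched as follows. This is only an illustration under stated assumptions: the JSON layout, the example.invalid URLs and the names parse_version and find_updates are all hypothetical, and this is not how the existing updater works.

```python
import json


def parse_version(text):
    """Turn a dotted version such as '0.13' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def find_updates(installed, manifest_text):
    """Return {add-on name: download URL} for installed add-ons that are outdated."""
    manifest = json.loads(manifest_text)
    updates = {}
    for name, entry in manifest.items():
        current = installed.get(name)
        if current is None:
            continue  # listed in the manifest but not installed here
        if parse_version(entry["version"]) > parse_version(current):
            updates[name] = entry["url"]
    return updates


# Hypothetical manifest: in practice the updater would download this text.
manifest_text = json.dumps({
    "onlineOCR": {
        "version": "0.13",
        "url": "https://example.invalid/onlineOCR-0.13.nvda-addon",
    },
    "readFeeds": {
        "version": "1.0",
        "url": "https://example.invalid/readFeeds-1.0.nvda-addon",
    },
})
installed = {"onlineOCR": "0.12", "readFeeds": "1.0"}
updates = find_updates(installed, manifest_text)
print(updates)
```

Because the list lives in a file rather than in the updater's code, a new add-on only needs a new entry in that file.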


Re: Online OCR addon #addonrequestreview

Noelia Ruiz
 

We will post 0.13 on the website unless you release 0.14. I hadn't seen this before answering.
I can update the add-on and documentation on the website after work, in about 6 or 7 hours.


2019-03-28 10:33 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:

Hi all,

Version 0.13 of the online image describer add-on is released.

Changes in this version include:

Make sure that the add-on works when reloading plug-ins without a restart (NVDA+Control+F3)

Here is the download link:

https://github.com/larry801/online_ocr/releases/download/0.13-dev/onlineOCR-0.13-sdev.nvda-addon)

Cheers,

Larry

On 2019/3/28 16:43, Larry Wang wrote:
Hi all,

0.12 version of online image describer addon is released

Changes in this version include



Re: Online OCR addon #addonrequestreview

Noelia Ruiz
 

Hi, let us know whether we should post 0.12 or 0.13 with the profile issue fixed.
Also, for add-ons reviewed by me, if the authors show Git experience and have a Bitbucket account, I try to offer them write access to the addonFiles repo, so they can update versions on the website themselves if they want. Otherwise, we can do it.
So, if you have a Bitbucket account and want write access to addonFiles, let us know. Just be patient, since Bitbucket is not very comfortable to use and granting this access can be hard; at least in my case, I sometimes need to retry several times.

Cheers


2019-03-28 9:51 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:

Hi Noelia

1. The documentation problem mentioned above is corrected in 0.12.

2. Access type is valid for Oliver's engine. That is because I provide a proxy for users without access to Google; by default it connects directly to Google.
3. I will look into how to use a checkbox. The checkbox in the advanced panel may be a good candidate.


On 2019/3/28 15:10, Noelia Ruiz wrote:
Also, the plugins based on Oliver's work seem not to need an API key. So I think the combo box should be disabled, since its label refers to the API key but its options correspond to the proxy.
About the details supported by plugins like Azure Analyser, I think a list of checkboxes may be better than a combo box.
Also, I saw that in some places imageInfo appears as not defined. This is normal, since the add-on is very complex and has a lot of plugins.
Anyway, we can easily update the version and the formatted documentation.

Cheers


On 27/03/2019 at 23:05, Noelia Ruiz via Groups.Io wrote:
Hi, your suggestion sounds good. I don't use the numeric keypad, and I use the laptop layout with Caps Lock as the NVDA key, so I don't have problems with key combinations, but I agree with you.

Cheers


On 27/03/2019 at 22:16, Robert Hänggi wrote:
Hi Noelia,
I would swap the two gestures: NVDA+Alt for the normal functionality and NVDA+Shift+Control for the clipboard, because pressing three keys is hard enough; no need for four, especially if you want to perform a double press.

I wonder, shouldn't the NumPad be used if available? After all, the gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9 (I've seen the latter one in the dragAndDrop add-on, but I don't know if that is still maintained).

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+control+shift is used with r for the navigator object and with p for clipboard images, while NVDA+alt is used with p for the navigator object in the case of the image describer and with r for the clipboard. This may be inconsistent. Maybe it would be better to use control+NVDA+shift for the navigator object and NVDA+alt for the clipboard?

Cheers


On 27/03/2019 at 20:39, Noelia Ruiz via Groups.Io wrote:
Hi again, the add-on is on the website. Anyway, I recommend formatting the documentation a bit. For example, you may want to change the first level 3 heading after level 1 to level 2, and also fix the lists, since the list ends after the log option, etc.
I have put asterisks at the start so that the compatibility info, author and download links appear in a list, as is done for other add-ons on the website.
Also, the link uses the addonFiles address, to avoid issues in translations.
Once you declare the add-on stable, we can register it for translation. For now, translators can start with the documentation.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

https://addons.nvda-project.org/addons/onlineOCR.en.html
El 27/03/2019 a las 19:29, Noelia Ruiz via Groups.Io escribió:
Hi, basic review results after adding image describers:

- License and copyright: Pass.
- Security: pass.
- Documentation: Pass.
- User experience: pass with comments.

Image describers can provide text results (not in browse mode), and
this has to be fixed. Also, when pressing the gesture twice, NVDA
can
report that there is active another recognizer.
Anyway, seing that author is listening and responding feed-back, and
that Oliver work has been used and mentioned in documentation, and
moreover the code shows a deep experience looking at NVDA relevant
code, and also that the main structure is done, and fail refers to
part of some features, and that this should be tested in different
languages, I think we can post it in the development section of
website to share widely in different communities, since the
add-on is
useful now and will be improved.
Here is my log. Since no objections are expressed, I will post the
add-on now, and this is the log for emage describer errors:

INFO -
external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard
(19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing',
u'food']},
u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing',
u'food']},
u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback


(19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing',
u'food']},
u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}



El 27/03/2019 a las 18:36, Noelia Ruiz via Groups.Io escribió:
Yes, I have seen an error with text result in browse mode with the
image describer plugin. Anyway, I think it has a lot of feedback
and
just it could be added to development section of website, since the
major part is done, it has passed basic review, the quality in
general is good, and no major changes in documentation are going to
happen for now. If you dont have objections, I will post this in
about an hour for wide testing, and if minor version are released
they will be posted, but now translators can start to translate
documentation so that different communities can test the add-on
with
some quality and feedback, as a first filter.
Cheers""

Enviado desde mi iPhone

El 27 mar 2019, a las 17:43, Robert Hänggi
<aarjay.robert@gmail.com>
escribió:

Hi
I'm glad that the addOn makes such huge steps.

The browsable message doesn't seem to work for NVDA+Alt+P if I
swap
the gestures.

The description might need some formatting, at least some line
breaks.

That's the result with azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains racy
content
Dominant foreground color is Red. Dominant background color is
White.
Hex code of accent color is 000000. Dominant colors: White The
image
is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of
Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or expressed
with the NVDA colour descriptions.

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this wonderful
add-on on the website after job, in the evening, in about 2
hours or
so.
Personally, I feel a deep emotion for this. Of course, this
doesn't
replace a human description where we can ask questions and so on.
I think this is an open door also for students in scientific
degrees,
where people can have a lot of problems and some of us have
listened
things like: "Why don't you try to drive a car?". Since some
things
can be identified by machine learning, or at least this can
help us,
and this is very important.
I have a lot of job and hope other reviewers can help with new
add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang
<larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of online ocr addon is released.
Changes in this version include:
Change addon summary to online image describer
Added image description capability

NVDA+Alt+P Recognize current navigator object Then read
result. If
pressed
twice, open a virtual result document.

Control+Shift+NVDA+P Recognizes image in clipboard . Then read
result. If
pressed twice, open a virtual result document.
Here are three engines available.

### Machine Learning Engine by Oliver Edholm
It's a free engine gives description of an image.
If there is text inside it will do OCR on the image.
There are two settings for this engine.
* Language of result:
English by default. If you configure another language than
English, the
description could have translation issues because it's
automatically
generated by machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud
Platform for
analysis. After the analysis the image gets removed from the
server and
will never be seen again.

The author of this addon has setup a proxy on www.nvdacn.com for
users
who
cannot access google service access.
If you want to use this proxy please chose Use proxy on
www.nvdacn.com in
access type settings.

If you want to use your own key in the following two Microsoft
engines.
Please follow the guide in Microsoft Azure OCR section.
### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on the
image
content.
This engine is english only by now.

Visual Features include:
Adult - detects if the image is pornographic in nature (depicts
nudity or
a
sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including the
approximate
location. The Brands argument is only available in English.
Categories - categorizes image content according to a taxonomy
defined in
documentation.
Color - determines the accent color, dominant color, and whether
an image
is black&white.
Description - describes the image content with a complete
sentence
in
supported languages.
Faces - detects if faces are present. If present, generate
coordinates,
gender and age.
ImageType - detects if image is clip art or a line drawing.
Objects - detects various objects within an image, including the
approximate location. The Objects argument is only available in
English.
Tags - tags the image with a detailed list of words related
to the
image
content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human
readable
language
with complete sentences. The description is based on a
collection of
content tags, which are also returned by the operation. More
than
one
description can be generated for each image. Descriptions are
ordered by
their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description
of the
image.
English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The
default is
1.


Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon






Cheers,
Larry



On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz <nrm1977@gmail.com>
wrote:

In case it's useful, more suggestion:
- In the documentation presented in input mode for
NVDA+shift+control+r
announces that recognizes the content and opens a virtual
result,
but
this is configurable. This may be mentioned indicating that the
result
could be spoken and braillified or presented in a virtual
document.
- Regarding the code, I have seen that sometimes it's used
something
like:
if something: return
else...
But I think in some cases else after return is not needed,
since
if the
previous condition exists, the script return and no more can
happen. I
made this mistake years ago and Mesar Hameed, a great person
and
developer, who created initially the scripts used for the
translation
system and make add-ons translatable by NVDA's translators
exchanging
messages between add-ons stored on Bitbucket team account
and the
translators repository, also first creator of guidelines for
reviewers
and authors and so on, fixed me teaching this mistake made
by me,
in
case you are interested.
When you want, for example when image descriptions are
added, we
can
post the add-on on the website, and if this is joined with
ImageDescriber, we will fix things later. I say this since
maybe
complicated to join these add-ons, imo, at least currently,
since
the
two interfaces are very different. Anyway, if you want to
wait to
post
the add-on, this is OK.


Cheers

El 22/03/2019 a las 11:49, Noelia Ruiz via Groups.Io escribió:
Sorry, another thought: I think that Larry could accept pull
request
if the features contained in ImageDescribed are properly
integrated in
online_ocr. For example, I don't know a good idea just to join
the two
add-ons, and I would understand that for any reason this could
be done
and they remain as independent add-ons. Please don't feel any
pressure
for our part (mentioning Robert and me).
Just for clarify

2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:
Hi, this is great news. I think that Larry may accept pull
requests,
since the add-on gui seems to be flexible, accepting
profiles,
integrated in NVDA settings, and accepting content recog
messages,
etc. Also, I think that creating pull request in this add-on
can be
easy, using the abstract class for plugins. So, imo, if
both you
agree, I would vote for adding just one joined add-on on the
website.
ImageDescriber could be a better name since it's more
generic,
though
I definitely would use Larry's UI and framework. What do you
think?

Cheers and thanks both for your work.

2019-03-22 11:25 GMT+01:00, Robert Hänggi
<aarjay.robert@gmail.com>:
Oliver, I'm very glad that you're open to collaboration.
I think you're joined efforts would lead to a great add-on.
Apart from that, it may help to avoid clones of 3rd party
modules
present in both add-ons and thus reducing storage and
load time.

I could perhaps also help with mouse related stuff (e.g.
recognition
of unknown pointer shapes automatically).

Cheers
Robert

On 22/03/2019, Oliver Edholm <oliver.edholm@gmail.com> wrote:
Hi! It's the creator of the Image Describer add-on. I looked a little
bit at the code and it looks very well made.

I agree that we do similar things here. Yesterday, for example, I added
OCR to the Image Describer backend (without knowing about this add-on).

I'm also working on a free Captcha Solver, which seems to be something
Larry has also looked into.

Also, applying more AI methods to these problems is something I've been
working on. You gave the example of icon classification; I trained a
classifier for this ~2 weeks ago, which I intend to eventually add to
the Image Describer add-on.

I'm definitely open to some sort of collaboration here.

On Fri, Mar 22, 2019 at 01:39 AM, Robert Hänggi wrote:

Wouldn't AI conflict with the objectDescriber add-on?
I mean, two add-ons with essentially the same functionality is a bit
strange.

However, I certainly see positive results when AI is integrated.
Imagine an inaccessible application. A lot of the time, it uses not
only text but also symbols, and it would be great if general ones
could be recognized:
- disc icons
- cog wheel / wrench icons
- transport controls: play/fast forward/rewind icons
- home icon
- triangles (for drop-down menus)
- check boxes, radio buttons, folder icons (= browse for folder)
and so on.

Additionally, simple rectangles, which could indicate form fields and
edit boxes. Compare the auto form detection used in Acrobat Pro DC.

(By the way, in my version of golden cursor I make use of changing
mouse icons to determine e.g. the boundaries of an edit box, scroll
bars or sliders.)

Another thing that I would like to see:
If one presses Enter in a recognition result, the default action is
performed. So far so good, but here are some possible improvements:
Case 1, when the recognition result stays open:
- perform a new recognition and update the document to reflect
possible changes.
Case 2, when the navigator object changes and the document closes:
- if the new navigator object is not a known class (e.g. window or
system client or generic, without children), do as for case 1.

Best
Robert



On 21/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:

Control+NVDA+Shift+I is assigned to the readfeeds add-on, but feel free
to assign it, since readfeeds is an add-on maintained and initially
created by me (smile).

Sent from my iPhone


On 21 Mar 2019, at 8:17, Larry Wang <larry.wang.801@gmail.com> wrote:

AI recognition is not hard to add, since it performs a similar
operation on an image; I have found several engines too. But we need
another two gestures.

Are NVDA+Ctrl+Shift+I and NVDA+Ctrl+Alt+Shift+I unassigned?


On 2019/3/21 12:53, Noelia Ruiz wrote:
OK, let's wait some hours, and tonight or tomorrow I will try to post
the add-on on the website if all is OK. For now I will upload the dev
version with one link, and when you declare the add-on as stable we
will post the link to the stable version too and register the add-on
in the translation system.
I would like it if you could add AI recognition of images, for
instance whether they contain a man, a room, blond or brown hair, etc.
This is wonderful work!

On 21/03/2019 at 1:47, Larry Wang wrote:
Hi everyone,

I am happy to release version 0.10 of the online OCR add-on.
Changes in this version include:

Fix error using the user's own API key in the sougou API

Fix unknown panel in the sougou API settings

Here is the direct download link:

https://github.com/larry801/online_ocr/releases/download/0.10-dev/onlineOCR-0.10-dev.nvda-addon

Cheers,
Larry

Re: Online OCR addon #addonrequestreview

Larry Wang
 

Hi all,

Version 0.13 of the online image describer add-on is released.

Changes in this version include:

Make sure that the add-on works when reloading the plug-ins without
restart (NVDA+Control+F3)

Here is the download link:

https://github.com/larry801/online_ocr/releases/download/0.13-dev/onlineOCR-0.13-sdev.nvda-addon

Cheers,

Larry

On 2019/3/28 16:43, Larry Wang wrote:
Hi all,

Version 0.12 of the online image describer add-on is released.

Changes in this version include


Re: Online OCR addon #addonrequestreview

Larry Wang
 

Hi Robert,

Thanks for your advice.
1. The "there is another recognition ongoing" message may be related to missing cleanup after a timeout.
2. The except clause in my recognition function is too broad; failing early really helps a lot with debugging. The reason I caught everything was that if a user runs this add-on on stable NVDA, there is no error sound when exceptions occur, which would make people think the add-on is not responding.

3. The messages were split out for reuse at first. But after refactoring the engine code I find that your suggestion would make most of the messages easier to maintain. Only a few messages can be reused, such as the "no image" message in __init__.py.

4. As for gestures, I have fixed the inconsistency you mentioned: NVDA+Alt for object navigation and Control+NVDA+Shift for the clipboard. Repeating gestures may not be too hard to do: just hold down Control, Alt and NVDA, then press P or R twice. Maybe you are trying to press all four keys twice at once?

I have mapped NVDA+numpad3 to "Moves to the previous object in a flattened view of the object navigation hierarchy" and NVDA+numpad9 to the next one, so I did not think of using the numpad.

5. Version 0.13 fixed the config switch issue.

On 2019/3/28 15:39, Robert Hänggi wrote:
Hi

There are plenty of errors going on and the messages aren't that
helpful, i.e. they are sometimes not related to the exception that
caused them.
I wouldn't generalize too much.
1. Put the messages where they belong, not e.g. "ui.message(failed_message)".
2. treat different errors in separate except clauses with the
appropriate message. Also, keep the try clause brief.
Example for both:

try:
    ocrResult = self.extract_text(result)
    if ocrResult.isspace():
        # Translators: Reported when recognition result is empty
        ocrResult = _(u"blank. There may be no text on this image.")
    resultText = result_prefix + ocrResult
    if config.conf[self.configSectionName]["copyToClipboard"]:
        import api
        api.copyToClip(resultText)
    if self.text_result:
        ui.message(resultText)
    else:
        self._onResult(LinesWordsResult(
            self.convert_to_line_result_format(result), imageInfo))
except Exception as e:
    log.error(e)
    log.error(result)
    ui.message(failed_message)
finally:
    self._onResult = None

This particular section from onlineOCRHandler.py causes problems
because "imageInfo" is not defined in the function:

ERROR - external:globalPlugins.onlineOCR.onlineOCRHandler.BaseRecognizer.callback
(08:27:15.871):
global name 'imageInfo' is not defined
ERROR - external:globalPlugins.onlineOCR.onlineOCRHandler.BaseRecognizer.callback
(08:27:15.888):
{u'SearchablePDFURL': u'Searchable PDF not generated as it was not
requested.', u'OCRExitCode': 1, u'IsErroredOnProcessing': False,
u'ParsedResults': [{u'FileParseExitCode': 1, u'TextOverlay':
{u'HasOverlay': True, u'Lines':
(snip)

3. Make sure that the add-on works when reloading the plug-ins without
restart (NVDA+Control+F3)
I get the following when doing so:

ERROR - extensionPoints.Action.notify (07:59:08.701):
Error running handler <bound method ?.handlePostConfigProfileSwitch of
<class 'globalPlugins.onlineOCR.OnlineImageDescriberHandler.OnlineImageDescriberHandler'
,>> for <extensionPoints.Action object at 0x02AAE190>
Traceback (most recent call last):
File "extensionPoints\__init__.pyc", line 47, in notify
File "extensionPoints\util.pyc", line 185, in callWithSupportedKwargs
File "C:\Users\Robert
Hänggi\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\abstractEngine.py",
line 175, in handlePostConfigProfileSwitch
File "C:\Users\Robert
Hänggi\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\abstractEngine.py",
line 312, in loadSettings
File "baseObject.pyc", line 31, in __get__
File "C:\Users\Robert
Hänggi\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\imageDescribers\azureAnalyse.py",
line 95, in _get_supportedSettings
File "C:\Users\Robert
Hänggi\AppData\Roaming\nvda\addons\onlineOCR\globalPlugins\onlineOCR\OnlineImageDescriberHandler.py",
line 550, in AccessTypeSetting
AttributeError: 'NoneType' object has no attribute 'StringSettings'

I think getting the trace back during development is better than
catching all exceptions and just logging them, especially if the
errors are nested.
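
As a rough illustration of the points above (a brief try clause, a specific except clause with its own message, and a log entry that keeps the traceback), here is a minimal Python sketch. It is not code from the add-on: `extract_text`, `handle_result`, the `speak` callback, the two message constants and the `ParsedText` key are assumptions made for the example, and the standard `logging` module stands in for NVDA's logger.

```python
import logging

log = logging.getLogger("onlineOCR.sketch")

BLANK_MESSAGE = "blank. There may be no text on this image."
BAD_RESPONSE_MESSAGE = "The service returned an unexpected response."


def extract_text(result):
    # Assumed response layout: {"ParsedResults": [{"ParsedText": "..."}]}
    return result["ParsedResults"][0]["ParsedText"]


def handle_result(result, speak):
    """Report a recognition result through the speak callback."""
    # Keep the try clause brief: only the call that can fail this way.
    try:
        text = extract_text(result)
    except (KeyError, IndexError, TypeError):
        # log.exception records the full traceback, not just the message.
        log.exception("Unexpected response layout: %r", result)
        speak(BAD_RESPONSE_MESSAGE)
        return None
    if not text or text.isspace():
        speak(BLANK_MESSAGE)
        return ""
    speak(text)
    return text
```

Any other exception, such as an undefined name, is deliberately not caught here, so it surfaces with its own traceback during development.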

Of course, that's my subjective opinion and I don't want to criticise
your programming style, it is probably better than mine.

Cheers
Robert


On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, your suggestion sounds good. I don't use the numeric keypad, and I
use a laptop keyboard with Caps Lock as the NVDA key, so I don't have
problems with key combinations, but I agree with you.

Cheers


On 27/03/2019 at 22:16, Robert Hänggi wrote:
Hi Noelia,
I would swap the two gestures. NVDA+Alt for the normal functionality
and NVDA+Shift+Control for the clipboard because pressing three keys
is hard enough, no need for 4. Especially if you want to perform a
double gesture.

I wonder, shouldn't the NumPad be used, if available, after all, the
gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
(I've seen the latter one in the dragAndDrop addOn but I don't know if
that is still maintained).

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+Control+Shift is used with R for the navigator object and with P
for clipboard images, while Alt+NVDA is used for the navigator object
in the image describer and, with R, for the clipboard. This may be
inconsistent. Maybe it would be better to use Control+NVDA+Shift for
the navigator object and NVDA+Alt for the clipboard?

Cheers


On 27/03/2019 at 20:39, Noelia Ruiz via Groups.Io wrote:
Hi again, the add-on is on the website. Anyway, I recommend that you
format the documentation a bit: for example, you may want to change
the first level 3 heading after level 1 to level 2, and also fix the
lists, since the list ends after the log option, etc.
I have put asterisks at the start of lines, so that the compatibility
info, author and download links appear in a list, as done with other
add-ons on the website.
Also, the link using the addonFiles address is used, to avoid issues
in translations.
When you declare the add-on as stable, we can register it to be
translated. For now, translators can start with the documentation.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

On 27/03/2019 at 19:29, Noelia Ruiz via Groups.Io wrote:
Hi, basic review results after adding image describers:

- License and copyright: pass.
- Security: pass.
- Documentation: pass.
- User experience: pass with comments.

Image describers can provide text results (not in browse mode), and
this has to be fixed. Also, when pressing the gesture twice, NVDA can
report that another recognizer is active.
Anyway, the author is listening and responding to feedback, Oliver's
work has been used and is mentioned in the documentation, the code
shows deep familiarity with the relevant NVDA code, the main structure
is done, the failures affect only part of some features, and the
add-on should be tested in different languages. So I think we can post
it in the development section of the website to share it widely in
different communities, since the add-on is useful now and will be
improved.
Since no objections have been expressed, I will post the add-on now.
Here is my log for the image describer errors:

INFO -
external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard
(19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}



On 27/03/2019 at 18:36, Noelia Ruiz via Groups.Io wrote:
Yes, I have seen an error with the text result in browse mode with the
image describer plugin. Anyway, I think the add-on has received a lot
of feedback and could simply be added to the development section of
the website, since the major part is done, it has passed basic review,
the quality in general is good, and no major changes in documentation
are going to happen for now. If you don't have objections, I will post
it in about an hour for wide testing. If minor versions are released,
they will be posted too, but translators can already start translating
the documentation, so that different communities can test the add-on
with some quality and give feedback, as a first filter.
Cheers

Sent from my iPhone

On 27 Mar 2019, at 17:43, Robert Hänggi <aarjay.robert@gmail.com>
wrote:

Hi
I'm glad that the addOn makes such huge steps.

The browsable message doesn't seem to work for NVDA+Alt+P if I swap
the gestures.

The description might need some formatting, at least some line
breaks.

That's the result with azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains racy
content
Dominant foreground color is Red. Dominant background color is
White.
Hex code of accent color is 000000. Dominant colors: White The image
is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of
Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or expressed
with the NVDA colour descriptions.
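
The two formatting suggestions above (omit empty tags, express the accent colour as RGB) could look roughly like the following Python sketch. `hex_to_rgb` and `format_description` are hypothetical helpers written for this example, not functions from the add-on or from NVDA, and the field labels are only placeholders.

```python
def hex_to_rgb(hex_code):
    """Convert a six-digit hex colour such as "FF8000" to (255, 128, 0)."""
    digits = hex_code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hex digits, got %r" % hex_code)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def format_description(fields):
    """Join (label, value) pairs one per line, skipping empty values."""
    lines = []
    for label, value in fields:
        if not value:
            continue  # omit empty tags, categories, and so on
        if isinstance(value, (list, tuple)):
            value = ", ".join(value)
        lines.append("%s: %s" % (label, value))
    return "\n".join(lines)


fields = [
    ("Tags", ["screenshot", "design"]),
    ("Brands", []),
    ("Accent colour (RGB)", "%d, %d, %d" % hex_to_rgb("000000")),
]
print(format_description(fields))
```

Putting each field on its own line also gives the line breaks asked for earlier in this message.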

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this wonderful
add-on on the website after work, in the evening, in about 2 hours or
so.
Personally, I feel deep emotion about this. Of course, this doesn't
replace a human description, where we can ask questions and so on.
I think this also opens a door for students in scientific degrees,
where people can have a lot of problems and some of us have heard
things like: "Why don't you try to drive a car?". Some things can be
identified by machine learning, or at least this can help us, and this
is very important.
I have a lot of work and hope other reviewers can help with new
add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of the online OCR add-on is released.
Changes in this version include:
Change add-on summary to online image describer
Added image description capability

NVDA+Alt+P: Recognizes the current navigator object, then reads the
result. If pressed twice, opens a virtual result document.

Control+Shift+NVDA+P: Recognizes the image in the clipboard, then
reads the result. If pressed twice, opens a virtual result document.
There are three engines available.

### Machine Learning Engine by Oliver Edholm
It's a free engine that gives a description of an image.
If there is text inside, it will do OCR on the image.
There are two settings for this engine.
* Language of result:
English by default. If you configure a language other than English,
the description could have translation issues because it's
automatically generated by a machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud Platform
for analysis. After the analysis the image gets removed from the
server and will never be seen again.

The author of this add-on has set up a proxy on www.nvdacn.com for
users who cannot access Google services.
If you want to use this proxy, please choose "Use proxy on
www.nvdacn.com" in the access type settings.

If you want to use your own key in the following two Microsoft
engines, please follow the guide in the Microsoft Azure OCR section.
### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on the image
content.
This engine is English-only for now.

Visual features include:
Adult - detects if the image is pornographic in nature (depicts nudity
or a sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including the
approximate location. The Brands argument is only available in English.
Categories - categorizes image content according to a taxonomy defined
in the documentation.
Color - determines the accent color, dominant color, and whether an
image is black and white.
Description - describes the image content with a complete sentence in
supported languages.
Faces - detects if faces are present. If present, generates
coordinates, gender and age.
ImageType - detects if the image is clip art or a line drawing.
Objects - detects various objects within an image, including the
approximate location. The Objects argument is only available in
English.
Tags - tags the image with a detailed list of words related to the
image content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human-readable
language with complete sentences. The description is based on a
collection of content tags, which are also returned by the operation.
More than one description can be generated for each image.
Descriptions are ordered by their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description of the
image. English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The default
is 1.


Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon




Cheers,
Larry



On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz <nrm1977@gmail.com>
wrote:

In case it's useful, more suggestions:
- The documentation presented in input help mode for
NVDA+Shift+Control+R announces that it recognizes the content and
opens a virtual result, but this is configurable. This could be
mentioned, indicating that the result can be spoken and brailled or
presented in a virtual document.
- Regarding the code, I have seen that sometimes something like this
is used:
if something: return
else...
But I think in some cases the else after return is not needed, since
if the condition is met, the script returns and nothing more can
happen. I made this mistake years ago, and Mesar Hameed corrected me
and taught me about it, in case you are interested. Mesar is a great
person and developer: he initially created the scripts used by the
translation system, which make add-ons translatable by NVDA's
translators by exchanging messages between the add-ons stored on the
Bitbucket team account and the translators' repository, and he was
also the first author of the guidelines for reviewers and authors.
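
The point about else after return can be shown with a small Python sketch. Both functions are hypothetical examples written for illustration, not code taken from the add-on; they behave identically, and the second simply drops the redundant else.

```python
def describe_with_else(image_bytes):
    if image_bytes is None:
        return "no image"
    else:
        return "%d bytes" % len(image_bytes)


def describe_early_return(image_bytes):
    # Guard clause: once we return, nothing below runs, so no else is needed.
    if image_bytes is None:
        return "no image"
    return "%d bytes" % len(image_bytes)
```

The early-return form keeps the main path unindented, which tends to help when several checks are stacked at the top of a script.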
When you want, for example when image descriptions are added, we can
post the add-on on the website, and if this is joined with
ImageDescriber, we will fix things later. I say this since it may be
complicated to join these add-ons, IMO, at least currently, since the
two interfaces are very different. Anyway, if you want to wait to post
the add-on, this is OK.


Cheers

Re: Online OCR addon #addonrequestreview

Larry Wang
 

Hi Noelia

1. The documentation problem mentioned above is corrected in 0.12.

2. The access type is valid for Oliver's engine. That is because I provide a proxy for users without access to Google; by default it connects directly to Google.
3. I will look into how to use checkboxes. The checkbox in the advanced panel may be a good reference.

On 2019/3/28 15:10, Noelia Ruiz wrote:
Also, the plugins based on Oliver's work don't seem to need an API key. So I think that the combo box should be disabled for them, since the combo box label refers to the API key but its options correspond to the proxy.
About the details supported by plugins like Azure Analyser, I think that a list of checkboxes may be better than a combo box.
Also, I saw that in some places imageInfo appears as not defined. This is normal, since the add-on is very complex and has a lot of plugins. Anyway, we can update the version and the formatted documentation easily.

Cheers


El 27/03/2019 a las 23:05, Noelia Ruiz via Groups.Io escribió:
Hi, your suggestion sounds good. I don't use numeric keys and use laptop key with bloc uppercase as NVDA key, so I don't have problems with combinations, but I agree with you.

Cheers


El 27/03/2019 a las 22:16, Robert Hänggi escribió:
Hi Noelia,
I would swap the two gestures. NVDA+Alt for the normal functionality
and NVDA+Shift+Control for the clipboard because pressing three keys
is hard enough, no need for 4. Especially if you want to perform a
double gesture.

I wonder, shouldn't the NumPad be used, if available, after all, the
gestures for object navigation are situated there.
And some are free, such as NVDA+NumPlusSign, NVDA+Num3, NVDA+Num9
(I've seen the latter one in the dragAndDrop addOn but I don't know if
that is still maintained).

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, another suggestion:
NVDA+control+shift are used, when pressing r, to object navigator, and
when pressing p, for clipboard images. And alt+nvda, for navigator
object in case of image descriptor, and with r for clipboard. This may
be inconsistent. Maybe better to use control+nvda+shift for navigator
object and nvda+alt for clipboard?

Cheers


El 27/03/2019 a las 20:39, Noelia Ruiz via Groups.Io escribió:
Hi again, the add-on is on the website. Anyway, I recommend you to
format a bit the documentation, for example, you may want to put The
first heading level 3 after level 1 to 2, and also fix lists, since
after the log option the list finishes, etc.
I have put asterisks at start, so that compatibility info, author and
download links appear in a list, as done with other add-ons on the
website.
Also, the link using addonFiles address is used, to avoid issues in
translations.
When you declare the add-on as stable, we can register it to be
translated. Now translators can start with documentation.
https://addons.nvda-project.org/addons/onlineOCR.en.html

Cheers

https://addons.nvda-project.org/addons/onlineOCR.en.html
El 27/03/2019 a las 19:29, Noelia Ruiz via Groups.Io escribió:
Hi, basic review results after adding image describers:

- License and copyright: Pass.
- Security: pass.
- Documentation: Pass.
- User experience: pass with comments.

Image describers can provide text results (not in browse mode), and
this has to be fixed. Also, when pressing the gesture twice, NVDA can
report that there is active another recognizer.
Anyway, seing that author is listening and responding feed-back, and
that Oliver work has been used and mentioned in documentation, and
moreover the code shows a deep experience looking at NVDA relevant
code, and also that the main structure is done, and fail refers to
part of some features, and that this should be tested in different
languages, I think we can post it in the development section of
website to share widely in different communities, since the add-on is
useful now and will be improved.
Here is my log. Since no objections are expressed, I will post the
add-on now, and this is the log for emage describer errors:

INFO -
external:globalPlugins.onlineOCR.GlobalPlugin.getImageFromClipboard
(19:17:16.818):
(u'C:\\Users\\User\\Downloads\\ant.jpg',)
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:09.683):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:09.683):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'e17fb346-8b0c-434e-b14f-66dabd794424'}
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:13.302):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:13.302):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'22101b48-0656-419b-84a4-5a69c85ed565'}
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:39.265):
'SimpleTextResult' object is not iterable
ERROR -
external:globalPlugins.onlineOCR.OnlineImageDescriberHandler.callback
(19:18:39.265):
{u'metadata': {u'width': 272, u'format': u'Png', u'height': 201},
u'description': {u'captions': [{u'text': u'a drawing of a face',
u'confidence': 0.3644558611090458}], u'tags': [u'drawing', u'food']},
u'requestId': u'94c34247-39e6-4242-928c-3afcf1cd670e'}



El 27/03/2019 a las 18:36, Noelia Ruiz via Groups.Io escribió:
Yes, I have seen an error with text result in browse mode with the
image describer plugin. Anyway, I think it has a lot of feedback and
just it could be added to development section of website, since the
major part is done, it has passed basic review, the quality in
general is good, and no major changes in documentation are going to
happen for now. If you dont have objections, I will post this in
about an hour for wide testing, and if minor version are released
they will be posted, but now translators can start to translate
documentation so that different communities can test the add-on with
some quality and feedback, as a first filter.
Cheers""

Enviado desde mi iPhone

El 27 mar 2019, a las 17:43, Robert Hänggi <aarjay.robert@gmail.com>
escribió:

Hi
I'm glad that the addOn makes such huge steps.

The browsable message doesn't seem to work for NVDA+Alt+P if I swap
the gestures.

The description might need some formatting, at least some line breaks.

That's the result with azure:

Categories: others_ outdoor_ text_sign
This image does not contain adult content This image contains racy
content
Dominant foreground color is Red. Dominant background color is White.
Hex code of accent color is 000000. Dominant colors: White The image
is not black and white.
Tags: screenshot design pixel vector typography
The image is not a clip-art. The image is not a lineDrawing
Descriptions: a close up of a device

(Note there is no racy content, LOL. It's the record button of
Audacity.)

Empty tags could be omitted.
The accent colour could perhaps be represented as RGB or expressed
with the NVDA colour descriptions.
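
The RGB suggestion above amounts to splitting the six-digit hex code into three byte values. A minimal sketch in plain Python follows. The function name `hex_to_rgb` is hypothetical, and mapping the result to NVDA's own colour descriptions is left out because it depends on NVDA internals.

```python
def hex_to_rgb(hex_code):
    """Convert a six-digit hex colour such as '000000' to an (r, g, b) tuple."""
    hex_code = hex_code.lstrip("#")
    if len(hex_code) != 6:
        raise ValueError("expected six hex digits, got {!r}".format(hex_code))
    # Each pair of hex digits is one colour channel in the range 0..255.
    return tuple(int(hex_code[i:i + 2], 16) for i in (0, 2, 4))


# The accent colour reported in the Azure result above.
print(hex_to_rgb("000000"))
```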

Anyways, great work.

Robert

On 27/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:
Hi, I will try to post the development version of this wonderful
add-on on the website after work, in the evening, in about 2 hours or
so.
Personally, I feel deeply moved by this. Of course, this doesn't
replace a human description where we can ask questions and so on.
I think this also opens a door for students in scientific degrees,
where people can have a lot of problems and some of us have heard
things like: "Why don't you try to drive a car?". Some things
can be identified by machine learning, or at least it can help us,
and this is very important.
I have a lot of work and hope other reviewers can help with new
add-ons too.

Cheers


2019-03-27 16:15 GMT+01:00, Larry Wang <larry.wang.801@gmail.com>:
Hi everyone,

Version 0.11 of the online OCR add-on is released.
Changes in this version include:
Changed the add-on summary to online image describer
Added image description capability

NVDA+Alt+P: Recognizes the current navigator object, then reads the
result. If pressed twice, opens a virtual result document.

Control+Shift+NVDA+P: Recognizes the image in the clipboard, then reads
the result. If pressed twice, opens a virtual result document.
There are three engines available.

### Machine Learning Engine by Oliver Edholm
A free engine that gives a description of an image.
If there is text inside the image, it will also run OCR on it.
There are two settings for this engine.
* Language of result:
English by default. If you configure a language other than English, the
description could have translation issues because it is automatically
generated by a machine translation service.

Security:
* The images are sent to a script hosted on the Google Cloud Platform for
analysis. After the analysis the image is removed from the server and
will never be seen again.

The author of this add-on has set up a proxy on www.nvdacn.com for users
who cannot access Google services.
If you want to use this proxy, please choose "Use proxy on
www.nvdacn.com" in the access type settings.

If you want to use your own key with the following two Microsoft engines,
please follow the guide in the Microsoft Azure OCR section.
### Microsoft Azure Image Analyser
This engine extracts a rich set of visual features based on the image content.
This engine is English only for now.

Visual Features include:
Adult - detects if the image is pornographic in nature (depicts nudity or a sex act). Sexually suggestive content is also detected.
Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English.
Categories - categorizes image content according to a taxonomy defined in documentation.
Color - determines the accent color, dominant color, and whether an image is black&white.
Description - describes the image content with a complete sentence in supported languages.
Faces - detects if faces are present. If present, generate coordinates, gender and age.
ImageType - detects if image is clip art or a line drawing.
Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English.
Tags - tags the image with a detailed list of words related to the image content.

Some features also provide additional details:

Celebrities - identifies celebrities if detected in the image.
Landmarks - identifies landmarks if detected in the image.

### Microsoft Azure Image describer

This engine generates a description of an image in human-readable language
with complete sentences. The description is based on a collection of
content tags, which are also returned by the operation. More than one
description can be generated for each image. Descriptions are ordered by
their confidence score.
There are two settings for this engine.
* Language
The language in which the service will return a description of the image.
English by default.

* Max Candidates
Maximum number of candidate descriptions to be returned. The default is 1.


Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.11-dev/onlineOCR-0.11-dev.nvda-addon




Cheers,
Larry



On Sat, Mar 23, 2019 at 8:30 PM Noelia Ruiz <nrm1977@gmail.com>
wrote:

In case it's useful, some more suggestions:
- The input help documentation for NVDA+Shift+Control+R announces that
it recognizes the content and opens a virtual result, but this is
configurable. It could mention that the result can be spoken and
brailled or presented in a virtual document.
- Regarding the code, I have seen that it sometimes uses something
like:
if something: return
else...
But I think in some cases the else after return is not needed, since
if the condition is met the script returns and nothing more can
happen. I made this mistake years ago and Mesar Hameed corrected me,
in case you are interested. Mesar is a great person and developer who
initially created the scripts used by the translation system, which
make add-ons translatable by NVDA's translators by exchanging
messages between the add-ons stored on the Bitbucket team account and
the translators' repository. He was also the first author of the
guidelines for reviewers and authors.
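
The point about else after return can be illustrated with two equivalent functions. This is a generic sketch and not code taken from the add-on. Both function names are hypothetical.

```python
def describe_with_else(result):
    """Version with a redundant else branch after return."""
    if result is None:
        return "No result"
    else:
        return "Result: " + result


def describe_early_return(result):
    """Same behaviour: once the if branch returns, no else is needed."""
    if result is None:
        return "No result"
    return "Result: " + result


# Both versions give the same answer for every input.
for value in (None, "a drawing of a face"):
    print(describe_with_else(value) == describe_early_return(value))
```

Dropping the else keeps the main path of the function at one indentation level, which tends to be easier to read in longer scripts.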
When you want, for example when image descriptions are added, we can
post the add-on on the website, and if it is later merged with
ImageDescriber, we will fix things then. I say this because merging
these add-ons may be complicated, imo, at least currently, since the
two interfaces are very different. Anyway, if you want to wait before
posting the add-on, this is OK.


Cheers

On 22/03/2019 at 11:49, Noelia Ruiz via Groups.Io wrote:
Sorry, another thought: I think that Larry could accept a pull request
if the features contained in ImageDescriber are properly integrated in
online_ocr. That said, I don't know whether simply joining the two
add-ons is a good idea, and I would understand if, for any reason, this
is not done and they remain independent add-ons. Please don't feel any
pressure on our part (meaning Robert and me).
Just to clarify.

2019-03-22 11:34 GMT+01:00, Noelia Ruiz via Groups.Io
<nrm1977=gmail.com@groups.io>:
Hi, this is great news. I think that Larry may accept pull requests,
since the add-on GUI seems to be flexible: it accepts profiles, is
integrated in NVDA settings, accepts content recognition messages,
etc. Also, I think that creating a pull request for this add-on can be
easy, using the abstract class for plugins. So, imo, if you both
agree, I would vote for adding just one joined add-on to the website.
ImageDescriber could be a better name since it's more generic, though
I would definitely use Larry's UI and framework. What do you think?

Cheers and thanks both for your work.

2019-03-22 11:25 GMT+01:00, Robert Hänggi
<aarjay.robert@gmail.com>:
Oliver, I'm very glad that you're open to collaboration.
I think your joint efforts would lead to a great add-on.
Apart from that, it may help to avoid duplicate copies of third-party
modules present in both add-ons, thus reducing storage and load time.

I could perhaps also help with mouse-related stuff (e.g. automatic
recognition of unknown pointer shapes).

Cheers
Robert

On 22/03/2019, Oliver Edholm <oliver.edholm@gmail.com> wrote:
Hi! It's the creator of the Image Describer add-on. I looked a little
bit at the code and it looks very well made.

I agree that we do similar things here. Yesterday I added OCR to the
Image Describer backend, for example (without knowing about this
add-on).

I'm also working on a free Captcha Solver, which seems to be something
Larry has also looked into.

Applying more AI methods to these problems is also something I've been
working on. You gave the example of icon classification: I trained a
classifier for this ~2 weeks ago, which I intend to eventually add to
the Image Describer add-on.

I'm definitely open to some sort of collaboration here.

On Fri, Mar 22, 2019 at 01:39 AM, Robert Hänggi wrote:


Wouldn't AI conflict with the objectDescriber add-on?
I mean, two add-ons with essentially the same functionality is a bit
strange.

However, I certainly see positive results when AI is integrated.
Imagine an inaccessible application. A lot of the time it uses not only
text but also symbols, and it would be great if generic ones could be
recognized:
- disc icons
- cog wheel/wrench icons
- transport controls: play/fast forward/rewind icons
- home icon
- triangles (for drop-down menus)
- check boxes, radio buttons, folder icons (= browse for folder) and so
on.

Additionally, simple rectangles, which could indicate form fields and
edit boxes. Compare the auto form detection used in Acrobat Pro DC.

(By the way, in my version of Golden Cursor I make use of changing mouse
pointer shapes to determine e.g. the boundaries of an edit box, scroll
bars or sliders.)

Another thing that I would like to see:
If one presses Enter in a recognition result, the default action is
performed. So far so good, but here are some possible improvements:
Case 1, when the recognition result stays open:
- perform a new recognition and update the document to reflect possible
changes.
Case 2, when the navigator object changes and the document closes:
- if the new navigator object is not a known class (e.g. window or
system client or generic, without children), do as for case 1.

Best
Robert



On 21/03/2019, Noelia Ruiz <nrm1977@gmail.com> wrote:

Control+NVDA+Shift+I is assigned to the readfeeds add-on, but feel free
to assign it, since readfeeds is an add-on maintained and initially
created by me (smile).

Sent from my iPhone


On 21 Mar 2019, at 8:17, Larry Wang <larry.wang.801@gmail.com> wrote:

AI recognition is not hard to add, since it performs a similar
operation on an image, and I have found several engines too. But we
need another two gestures.

Are NVDA+Ctrl+Shift+I and NVDA+Ctrl+Alt+Shift+I unassigned?


On 2019/3/21 12:53, Noelia Ruiz wrote:
OK, let's wait some hours, and tonight or tomorrow I will try to post
the add-on on the website if all is OK. For now I will upload the dev
version with one link, and when you declare the add-on stable we will
post the link to the stable version too and register the add-on in the
translation system.
I would like it if you could add AI recognition of images, for instance
whether they contain a man, a room, blond or brown hair, etc.
This is wonderful work!

On 21/03/2019 at 1:47, Larry Wang wrote:
Hi everyone,

I am happy to release version 0.10 of online ocr addon.
Changes in this version include

Fix error using user's own api key in sougou API

Fix unknown panel in sougou API settings

Here is the direct download link

https://github.com/larry801/online_ocr/releases/download/0.10-dev/onlineOCR-0.10-dev.nvda-addon





Cheers,
Larry

Compatibility checker

 

Hi all.

With upcoming releases of NVDA expected to cause issues with add-ons, I wonder whether there should be some tool that a user or developer could use to check whether add-ons meet any new NVDA specifications, not just for this release but for all future releases.

The tool would need the following features.

1.  The ability to check add-ons against specific versions: all current releases, previous releases where appropriate, alpha builds, beta builds, release candidates, stable versions, and the current snapshot if needed.

2.  The ability to check against a specific NVDA version, Windows version, etc. where applicable, at least until the end of Windows 8, perhaps.

3.  A check for future Windows versions, including Windows 10 Insider builds.

4.  A check of the Python version used, and of other libraries, where applicable.

5.  Finally, the ability to save results to a file, either text or nicely converted HTML, showing what is wrong and what needs to be done to fix it.

I do realise that this would probably need an online database or something similar.

I am not sure how this would work for everything, but it would be good if something like this existed.

For NVDA, at least for all versions that have compatibility flags, it could check against those, though they can change.

Then there are the changes themselves: I believe git changelogs and what's new documents get uploaded.

For the rest I am just waffling.

This could be an NVDA add-on, or initially an external tool, or it could be part of NVDA and be invoked from a developer menu inside Tools, or through a "check add-on compatibility" command like in Firefox.

On the other hand, if you didn't want it in NVDA itself, an external tool wouldn't need to be a Python program at all; it could be written in just about anything.
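
The version check described in this message could start as something like the sketch below. It is only an illustration under stated assumptions: it assumes the add-on manifest carries `minimumNVDAVersion` and `lastTestedNVDAVersion` entries as `key = value` lines, and all function names (`parse_version`, `read_manifest`, `check_addon`) are hypothetical. It does not cover Windows versions, Python versions, or report files, and NVDA's own compatibility rules may differ.

```python
def parse_version(text):
    """Turn '2019.1' or '2019.1.0' into a (year, major, minor) tuple."""
    parts = [int(part) for part in text.strip().split(".")]
    while len(parts) < 3:
        parts.append(0)  # Treat a missing minor number as 0.
    return tuple(parts[:3])


def read_manifest(manifest_text):
    """Collect simple 'key = value' lines from manifest text into a dict."""
    values = {}
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def check_addon(manifest_text, nvda_version):
    """Return 'incompatible', 'untested' or 'ok' for one NVDA version."""
    manifest = read_manifest(manifest_text)
    current = parse_version(nvda_version)
    minimum = parse_version(manifest.get("minimumNVDAVersion", "0.0.0"))
    last_tested = parse_version(manifest.get("lastTestedNVDAVersion", "0.0.0"))
    if current < minimum:
        return "incompatible"  # NVDA is older than the add-on requires.
    if current > last_tested:
        return "untested"  # Author has not declared testing on this NVDA.
    return "ok"


# Hypothetical manifest used only for this example.
example_manifest = """
name = demoAddon
minimumNVDAVersion = 2018.3.0
lastTestedNVDAVersion = 2019.1.0
"""

for version in ("2018.2", "2019.1", "2019.2"):
    print(version, check_addon(example_manifest, version))
```

Comparing versions as tuples of integers avoids the string-comparison trap where "2019.10" would sort before "2019.2".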


Re: Online OCR addon #addonrequestreview

Larry Wang
 

Hi all,

Version 0.12 of the online image describer add-on is released.

Changes in this version include:

Fixed the browse mode message of Microsoft Azure Image Describer.
The accent color is now represented using NVDA colour descriptions.
Improved the result format of Microsoft Azure Image Analyzer.
Improved the documentation according to review comments.
Fixed gesture inconsistency: Control+Shift+NVDA for the clipboard and NVDA+Alt for the navigator object.
Fixed a missing imageInfo error while recognizing.


Here is the direct download link.

https://github.com/larry801/online_ocr/releases/download/0.12-dev/onlineOCR-0.12-sdev.nvda-addon
